Foreword


I know that I speak for both old and new devotees of bacteriophage
when I say that this book has been wanted and
needed for a very long time. An entire generation of graduate 
students has gone on to become department chairs since the 
last version of The Bacteriophages. In a real sense, the field of 
bacteriophage biology died, was buried and plowed under, 
but is now arising again as vigorous fresh green shoots 
from the soil so thoroughly enriched. The evidence of its 
death is indisputable. If you do a search of the NIH CRISP 
database for grants with “bacteriophage” in the title in the 
years of 1972 and 2002, the numbers come up approximately
the same, above 200. However, a closer look reveals
that most of the projects funded by NIH in 1972 involved 
research in which plaques were generated every week in 
the natural course of experimentation. When I looked 
through the grants funded in 2002, and put aside phage 
display and other library implementations, I could find 
fewer than 20 and I know for certain that several of these 
have expired since then. In the 1970s, the Cold Spring 
Harbor “Phage” meetings were basically all about some 
aspect of bacteriophage biology; now, the meeting is still 
affectionately called “Phage” but I can tell you as an organizer
that it is a struggle to fill up even one evening session
of a six-day conference with vaguely phage-related talks. 

It should be a topic of some interest for science historians 
to explain how a field with so much momentum and so many 
talented practitioners suddenly turned its own lights off and 
just walked out the door. It was an exodus of talent and 
leadership of a scale, breadth, and suddenness never seen 
before in any field of biology, and perhaps in any field of 
modern science. My own theory is that the classical era of 
phage biology had at its core a suicidal impulse derived 
from physicists' reductionism. Others have suggested to me 
that the very success of molecular genetics, much of which 
was concerned with phages during the golden era, led to a 
kind of arrogance of invincibility and thus to a fearless rush
to harvest the low-hanging fruit in eukaryotic systems.
Perhaps it also was alluring to be in the founding circle for 
new study sections, where presumably the competition 
would be much less intense. 

In any case, there is a new phage biology emerging. In this 
new phage biology, the interest is in the phages themselves, 
not just in phages as a convenient system to learn new rules 
of molecular biology. The latter impulse is still alive, but its 
few remaining active adherents are mostly at or nearing 
retirement or bypass age. The new crowd of phage people 
come from all directions, not always intentionally. Phage 
are now being rediscovered as marvelous subjects for nanoscience
(hardly a surprise to any kid who has ever seen a
drawing of phage T4!). It also turns out, mirabile dictu, that
phage are involved in many aspects of bacterial evolution 
and pathogenesis. Indeed, many diseases and most dissemination
of virulence factors are basically phage phenomena,
despite the decades-long aversion of funding agencies to 
consider phage as relevant to human disease. Moreover, 
phage are now being found to be sources of genetic 
information useful in combating drug-resistant pathogens, 
which should have been obvious long ago. In fact, much 
of this volume is written by members of the new wave. 
And, not least, phage are now being tamed and harnessed 
themselves as therapeutic agents, more than half a century 
after d’Herelle's lonely, ostracized demise. 

Which brings us back to this book. Have you ever tried to 
find an up-to-date, comprehensive compendium of phage 
biology? Well, until now, you had very few choices, and 
most of them were out of print. As you will see, many of the 
chapters of this book are written by recognizable veterans of 
the classical phage years, but also many are written by new 
practitioners, some of whom didn’t arrive at this juncture 
intentionally. They simply followed the science, and the 
science of microbiology is now coming back, full circle, to 
bacteriophage. 


Ry Young 
Preface


The idea of creating a book about the bacteriophages was 
conceived by the late Heinz Fraenkel-Conrat in his role as 
an editor of a series called The Viruses. The first edition of 
The Bacteriophages was published by Plenum Press in 1988 
as two volumes. Its manager, Kirk Jensen, arranged for 
Oxford University Press to sponsor this second edition, and 
the new project was managed by Peter Prescott. Much has 
happened in the phage world since 1988. Horizontal gene 
transfer, which was just beginning to find acceptance then, 
has become an established issue today. Observation of 
the packaging of single DNA molecules and measurement 
of the force of the packaging motor is a recent happening. 
Phages with linear plasmid prophages were unknown 
in 1988. At that time, no one would have thought that 
a filamentous phage genome would integrate into a host 
genome. We have included articles on these subjects in 
this edition. 

We aim to provide a current guide to each of the major 
phage families and to provide a general description of the 
kinds of phages that are associated with the major classes of 
eubacteria and archaea. In addition, we wish to highlight 
interesting current topics that are relevant to many of the 
phage families. I have been asked on several occasions to 
advise colleagues on the control of phage infections in 
bacterial cultures and fermentation facilities. To answer 
this need, Gregg Bogosian of Monsanto Corporation has 
provided a description of how phage control is achieved 
where its economic impact is high. Phages have been used
widely to display antigens, and Bjorn Lindqvist has summarized
the state of this field. Due to renewed interest in
phage therapy, Carl Merril, Dean Scholl, and Sankar Adhya 
have contributed an article on the status and prospects of 
this art. 

The latter stages of producing this volume were greatly 
aided by Steve Abedon (see phage.org, the google.com 
“phage ecology” I'm feeling lucky site). Steve came to 
The Bacteriophages first as an author (chapter 5, Phage 
Ecology) and then as developer of the companion web 
site (thebacteriophages.org). In the course of the latter he 
became intimately involved in the formatting and refinement
of the figures (all of which may be found, some with
color, at thebacteriophages.org). This activity led to editing 
of the figure legends, and then to editing of all 48 chapters. 
Closer to the beginning, Hans Ackermann suggested the 
order of presentation of articles. 

This volume is dedicated to phage workers who have passed 
on recently: Gisela Mosig, who elucidated the diversity of 
DNA replication mechanisms used by coliphage T4; Edouard
Kellenberger, whose expertise in electron microscopy led to
the discovery of λ phage proheads, as well as to the superior
electron microscope facilities at the European Molecular
Biology Organization Laboratory; and Wolfram Zillig, who
traveled the globe by airplane, foot, and cross-country skis 
to collect Archaea and show that they could release virus 
particles. 






Contents 


Foreword by Ry Young

Contributors

PART I. General Background of Phage Biology

1. Phage and the Early Development of Molecular Biology
William C. Summers

2. Classification of Bacteriophages
Hans-W. Ackermann

3. Prophage Genomics
Harald Brüssow

4. Evolution of Tailed Phages: Insights from Comparative Phage Genomics
Harald Brüssow and Frank Desiere

5. Phage Ecology
Stephen T. Abedon

PART II. Life of Phages

6. DNA Packaging in Double-Stranded DNA Phages
Paul J. Jardine and Dwight L. Anderson

7. General Aspects of Lysogeny
Allan Campbell

8. Gene Regulatory Circuitry of Phage λ
John W. Little

9. Regulation of λ Gene Expression by Transcription Termination and Antitermination
David I. Friedman and Donald L. Court

10. Phage Lysis
Ry Young and Ing-Nang Wang

PART III. Cubic and Filamentous Phages

11. φX174 et al., the Microviridae
Bentley A. Fane, Karie L. Brentlinger, April D. Burch, Min Chen, Susan Hafenstein, Erica Moore, Christopher R. Novak, and Asako Uchiyama

12. Filamentous Phage
Marjorie Russel and Peter Model

13. PRD1: Dissecting the Genome, Structure, and Entry
A. Marika Grahn, Sarah J. Butcher, Jaana K. H. Bamford, and Dennis H. Bamford

14. Lipid-Containing Bacteriophage PM2, the Type Organism of Corticoviridae
Dennis H. Bamford and Jaana K. H. Bamford

15. Single-Stranded RNA Phages
Jan van Duin and Nina Tsareva

16. Phages with Segmented Double-Stranded RNA Genomes
Leonard Mindich

PART IV. Individual Tailed Phages

17. The T1-Like Bacteriophages
Gregory J. German, Rajeev Misra, and Andrew M. Kropinski

18. T4 and Related Phages: Structure and Development
Gisela Mosig and Fred Eiserling

19. Bacteriophage T5
Jon R. Sayers

20. The T7 Group
Ian J. Molineux

21. Bacteriophage N4
Krystyna M. Kazmierczak and Lucia B. Rothman-Denes

22. Phage φ29 and its Relatives
Margarita Salas

23. Bacteriophage SPP1
Juan C. Alonso, Paulo Tavares, Rudi Lurz, and Thomas A. Trautner

24. Bacteriophage P1
Hansjorg Lehnherr

25. The P2-Like Bacteriophages
Anders S. Nilsson and Elisabeth Haggard-Ljungquist

26. The Satellite Phage P4
Gianni Deho and Daniela Ghisotti

27. Bacteriophage λ and its Genetic Neighborhood
Roger W. Hendrix and Sherwood Casjens

28. N15: The Linear Plasmid Prophage
Nikolai V Ravin



29. Bacteriophage P22
Peter E. Prevelige, Jr.

30. The Bacteriophage Mu
Luciano Paolozzi and Patrizia Ghelardini

PART V. Phages by Host or Habitat

31. Viruses of Archaea
Kenneth M. Stedman, David Prangishvili, and Wolfram Zillig

32. Phages of Cyanobacteria
Nicholas H. Mann

33. Marine Phages
Robert V Miller

34. Yersinia Phages
Stefan Hertwig, Mikael Skurnik, and Bernd Appel

35. Temperate Bacteriophages of Bacillus subtilis
Pamela S. Fink and Stanley A. Zahler

36. Phages of Lactococcus lactis
Lone Brondsted and Karin Hammer

37. The Listeria Bacteriophages
Martin J. Loessner and Richard Calendar

38. Mycobacteriophages
Graham F. Hatfull

39. Molecular Genetics of Streptomyces Phages
Margaret C. M. Smith

40. Mycoplasma Phages
Jack Maniloff and Kevin Dybvig

41. Lactobacillus Phages
Harald Brüssow and Juan E. Suarez

PART VI. Applications

42. Control of Bacteriophage in Commercial Microbiology and Fermentation Facilities
Gregg Bogosian

43. Phage-Based Expression Systems
Noreen E. Murray

44. Phage in Display
Bjorn H. Lindqvist

45. Bacteriophage as Pollution Indicators
Charles P. Gerba

46. The Use of Phage as Diagnostic Systems
Cath Rees

47. Bacteriophages in Bacterial Pathogenesis
Patrick L. Wagner and Matthew K. Waldor

48. Phage Therapy
Carl R. Merril, Dean Scholl, and Sankar Adhya

Index


Contributors 


Stephen T. Abedon, Department of Microbiology, Ohio State 
University, Mansfield, OH 44906, USA 

Hans-W. Ackermann, Department of Medical Biology, 
Faculty of Medicine, Laval University, QC G1K 7P4,
Quebec, Canada 

Sankar Adhya, Laboratory of Molecular Biology, Center for 
Cancer Research, National Cancer Institute, National 
Institutes of Health, Bethesda, MD 20892, USA 

Juan C. Alonso, Departamento de Biotecnologia Microbiana,
Centro Nacional de Biotecnologia-Consejo Superior 
de Investigaciones Cientificas, Campus Universidad 
Autonoma de Madrid, Cantoblanco, 28049 Madrid, Spain 

Dwight L. Anderson, University of Minnesota, 18-256 Moos
Tower, 515 Delaware Street SE, Minneapolis, MN 55455,
USA 

Bernd Appel, Department of Biological Safety, Horizontal 
Gene Transfer, Robert Koch Institut, Nordufer 20, 13353 
Berlin, Germany 

Dennis H. Bamford, Department of Biosciences and Institute 
of Biotechnology, Biocenter 2, FIN-00014, University of 
Helsinki, Finland 

Jaana K. H. Bamford, Department of Biosciences and Insti¬ 
tute of Biotechnology, Biocenter 2, FIN-00014, University 
of Helsinki, Finland 

Gregg Bogosian, Monsanto Company, 800 N. Lindbergh 
Blvd, Creve Coeur, MO 63167, USA

Karie L. Brentlinger, Department of Veterinary Science and 
Microbiology, University of Arizona, Building 90, Tucson, 
AZ 85721, USA 

Lone Brondsted, Department of Veterinary Pathobiology, 
Section for Microbial Food Safety, The Royal Veteri¬ 
nary and Agriculture University, Stigbojlen 4, DK-1870 
Frederiksberg C, Denmark 

Harald Brüssow, Nestle Research Center, 1000 Lausanne 26,
Vers-chez-les-Blanc, Switzerland 

April D. Burch, Department of Veterinary Science and 
Microbiology, University of Arizona, Building 90, Tucson, 
AZ 85721, USA 

Sarah J. Butcher, Department of Biosciences and Institute 
of Biotechnology, Biocenter 2, FIN-00014, University of 
Helsinki, Finland 

Richard Calendar, Department of Molecular and Cell Biology, 
University of California, Berkeley, CA 94720-3202, USA 


Allan Campbell, Department of Biological Sciences, Stanford 
University, Stanford, CA 94305, USA 
Sherwood Casjens, Department of Pathology, University of 
Utah Medical Center, 30 North 1900 East, Salt Lake City, 
UT 84132, USA 

Min Chen, Department of Veterinary Science and Micro¬ 
biology, University of Arizona, Building 90, Tucson, AZ 
85721, USA 

Donald L. Court, Molecular Control and Genetics Section, 
Gene Regulation and Chromosome Biology Laboratory, 
Center for Cancer Research, National Cancer Institute, 
Building 539, P.O. Box B, Frederick, MD 21702-1201, USA 
Gianni Deho, Dipartimento di Scienze Biomolecolari e 
Biotecnologie, Universita degli Studi di Milano, Via Celoria 
26, 20133 Milan, Italy 

Frank Desiere, Nestle Research Center, 1000 Lausanne 26,
Vers-chez-les-Blanc, Switzerland 
Kevin Dybvig, Department of Genetics, University of 
Alabama at Birmingham, 720 South 20th Street, Kaul, 
Room 720, Birmingham, AL 35294, USA 
Fred Eiserling, Department of Microbiology, Immunology 
and Molecular Genetics, University of California, Los 
Angeles, Los Angeles, CA 90095, USA
Bentley A. Fane, Department of Veterinary Science and 
Microbiology, University of Arizona, Building 90, Tucson,
AZ 85721, USA 

Pamela S. Fink, Bioprocess Development, Wyeth Research, 
Pearl River, NY 10965, USA 

David I. Friedman, Department of Microbiology and Immu¬ 
nology, The University of Michigan, Medical School, 
Ann Arbor, MI 48109-0620, USA 
Charles P. Gerba, Department of Soil, Water and Environ¬ 
mental Science, University of Arizona, Tucson, AZ 85721, 
USA 

Gregory J. German, Faculty of Cellular and Molecular 
Biosciences, School of Life Sciences, Arizona State 
University, Tempe, AZ 85287-4501, USA 
Patrizia Ghelardini, Istituto di Biologia e Patologia 
Molecolari del CNR, c/o Universita di Roma “La Sapienza”, 
P.le Aldo Moro 5, 00185 Rome, Italy
Daniela Ghisotti, Dipartimento di Scienze Biomolecolari e 
Biotecnologie, Universita degli Studi di Milano, Via 
Celoria 26, 20133 Milan, Italy 





A. Marika Grahn, Department of Biosciences and Institute 
of Biotechnology, Biocenter 2, 00014, University of 
Helsinki, Finland 

Susan Hafenstein, Department of Veterinary Science and 
Microbiology, University of Arizona, Building 90, Tucson, 
AZ 85721, USA 

Elisabeth Haggard-Ljungquist, Department of Genetics, 
Microbiology and Toxicology, University of Stockholm, 
S-106 91 Stockholm, Sweden 

Karin Hammer, Microbial Physiology and Genetics, 
BioCentrum-DTU, Technical University of Denmark, 
DK-2800 Kgs. Lyngby, Denmark 
Graham F. Hatfull, Department of Biological Sciences, 
University of Pittsburgh, Pittsburgh, PA 15260, USA 
Roger W. Hendrix, Department of Biological Sciences, 
University of Pittsburgh, Pittsburgh, PA 15260, USA 
Stefan Hertwig, Department of Biological Safety, Horizontal 
Gene Transfer, Robert Koch Institut, Nordufer 20, 13353 
Berlin, Germany 

Paul J. Jardine, University of Minnesota, 18-256 Moos Tower,
515 Delaware Street SE, Minneapolis, MN 55455, USA
Krystyna M. Kazmierczak, Department of Biology, Indiana
University, Bloomington, IN 47405, USA 
Andrew M. Kropinski, Department of Microbiology and 
Immunology, Queen's University, Kingston, Ontario K7L
3N6, Canada 

Hansjorg Lehnherr, Department of Genetics and Biochemis¬ 
try, Institute of Microbiology, Ernst Moritz Arndt Univer¬ 
sity Greifswald, 17487 Greifswald, Germany 
Bjorn H. Lindqvist, Department of Molecular Biosciences, 
University of Oslo, 0315 Oslo, Norway 
John W. Little, Department of Biochemistry and Molecular 
Biophysics, University of Arizona, Tucson, AZ 85721, USA 
Martin J. Loessner, Institute of Food Science and Nutri¬ 
tion, Swiss Federal Institute of Technology (ETH), 
Schmelzbergstrasse 7, LFV B20 CH-8092 Zurich, 
Switzerland 

Rudi Lurz, Max-Planck-Institut für Molekulare Genetik,
Ihnestrasse 73, D-14195 Berlin, Germany
Jack Maniloff, Department of Microbiology and Immunology, 
University of Rochester, Rochester, NY 14642, USA
Nicholas H. Mann, Department of Biological Sciences, 
University of Warwick, Coventry CV4 7AL, UK 
Carl R. Merril, Section on Biochemical Genetics, National 
Institute of Mental Health, National Institutes of Health, 
Bethesda, MD 20892, USA 

Robert V Miller, Department of Microbiology and Mole¬ 
cular Genetics, Oklahoma State University, Stillwater, OK 
74078, USA 

Leonard Mindich, Department of Microbiology, The Public 
Health Research Institute, Newark, NJ 07103, USA 
Rajeev Misra, Faculty of Cellular and Molecular Biosciences, 
School of Life Sciences, Arizona State University, Tempe, 
AZ 85287-4501, USA 


Peter Model, Laboratory of Genetics, The Rockefeller 
University, New York, NY 10021, USA 
Ian J. Molineux, Department of Molecular Genetics and 
Microbiology, University of Texas, Austin, TX 78712-1095, 
USA 

Erica Moore, Department of Veterinary Science and Micro¬ 
biology, University of Arizona, Building 90, Tucson, AZ 
85721, USA 

Gisela Mosig, Formerly Department of Molecular Biology, 
Vanderbilt University, Nashville, TN 37235, USA 
Noreen E. Murray, Institute of Cell and Molecular Biology, 
University of Edinburgh, Edinburgh EH9 3JR, UK 
Anders S. Nilsson, Department of Genetics, Microbiology 
and Toxicology, University of Stockholm, S-106 91 
Stockholm, Sweden 

Christopher R. Novak, Department of Veterinary Science
and Microbiology, University of Arizona, Building 90, 
Tucson, AZ 85721, USA 

Luciano Paolozzi, Dipartimento di Biologia, Universita di 
Roma “Tor Vergata,” Via della Ricerca Scientifica, 00133 
Rome, Italy 

David Prangishvili, Institut Pasteur, 25-28 rue de Dr. Roux, 
75724 Paris Cedex 15, France 

Peter E. Prevelige, Jr., Department of Microbiology, University
of Alabama at Birmingham, 845 19th Street South, BBRB 
414, Birmingham, AL 35294-2170, USA 
Nikolai V Ravin, Center “Bioengineering”, Russian Academy 
of Sciences, Prosp. 60-let Oktiabria, bld.7-1, Moscow 
117312, Russia 

Cath Rees, School of Biosciences, University of Nottingham, 
Sutton Bonington Campus, Loughborough, Leicester¬ 
shire LE12 5RD, UK 

Lucia B. Rothman-Denes, Department of Molecular Genetics 
and Cell Biology, University of Chicago, Chicago, IL 60637, 
USA 

Marjorie Russel, Laboratory of Genetics, The Rockefeller 
University, New York, NY 10021, USA 
Margarita Salas, Instituto de Biologia Molecular Eladio
Vinuela (CSIC), Centro de Biologia Molecular Severo
Ochoa (CSIC-UAM), Universidad Autonoma de Madrid,
Cantoblanco, 28049 Madrid, Spain
Jon R. Sayers, Division of Genomic Medicine, Infection and 
Immunity, Medical School, University of Sheffield, 
Sheffield S10 2RX, UK

Dean Scholl, Section on Biochemical Genetics, National 
Institute of Mental Health, National Institutes of Health,
Bethesda, MD 20892, USA 

Mikael Skurnik, Department of Bacteriology and Immunol¬ 
ogy, Haartman Institute, P.O. Box 63, 00014 Helsinki, 
Finland 

Margaret C. M. Smith, University of Aberdeen, Institute of 
Medical Sciences, Aberdeen AB25 2ZD, UK 
Kenneth M. Stedman, Biology Department, Portland State 
University, P.O. Box 751, Portland, OR 97207, USA 





Juan E. Suarez, Area de Microbiologia, Universidad de 
Oviedo, Julian Claveria s/n, 33006 Oviedo, Spain 
William C. Summers, Department of Therapeutic Radiology 
and Molecular Biophysics and Biochemistry, Yale 
University, 266 Whitney Avenue, New Haven, CT 06520- 
8114, USA 

Paulo Tavares, Unite de Virologie Moleculaire et Structurale, 
UMR CNRS 2472-INRA 1157, Batiment 14B, CNRS, 
Avenue de la Terrasse, 91198 Gif-sur-Yvette Cedex, France
Thomas A. Trautner, Max-Planck-Institut für Molekulare
Genetik, Ihnestrasse 73, 14195 Berlin, Germany
Nina Tsareva, Department of Biochemistry, University of
Leiden, Leiden, The Netherlands 
Asako Uchiyama, Department of Veterinary Sciences and 
Microbiology, University of Arizona, Tucson, AZ 85721, 
USA 

Jan van Duin, Department of Biochemistry, LIC, Leiden 
University, P.O. Box 9502, 2300 RA, Leiden, The
Netherlands 


Patrick L. Wagner, Department of Microbiology, Tufts
University School of Medicine and Howard Hughes 
Medical Institute, 136 Harrison Avenue, Boston, MA 
02111, USA 

Matthew K. Waldor, Department of Microbiology, Tufts 
University School of Medicine and Howard Hughes 
Medical Institute, 136 Harrison Avenue, Boston, MA
02111, USA 

Ing-Nang Wang, Department of Biological Sciences, 
University at Albany, State University of New York, 1400 
Washington Avenue, Albany, NY 12222, USA 

Ryland F. Young III, Department of Biochemistry and
Biophysics, Texas A&M University, 2128 TAMU, College 
Station, TX 77843-2128, USA

Stanley A. Zahler, Department of Molecular Biology 
and Genetics, Cornell University, Ithaca, NY 14853, USA 

Wolfram Zillig, Formerly Max-Planck-Institut fur 
Biochemie, Am Klopferspitz 18A, 83152 Martinsried, 
Germany 






PART I 


GENERAL BACKGROUND 
OF PHAGE BIOLOGY 






1 


Phage and the Early Development of 
Molecular Biology 

WILLIAM C. SUMMERS 


Molecular biology evolved from multiple origins, including
the antivitalist biology of the early twentieth
century, which attempted to explain complex biological
phenomena in terms of physical and chemical phenomena
(16, 21); the Rockefeller Foundation program, led by physicists
and mathematicians who believed the “human
sciences” were ripe for deeper understanding based on
chemistry and physics (29); research by physicists and
chemists who saw life as a challenge to their burgeoning
understanding of the fundamental structure of matter
(2, 22); and work by a few microbiologists and geneticists
who sought better understanding of the natures of genes
and microbes (23).

Although on one hand profoundly reductionistic,
molecular biology eventually drew its boundaries to encompass
organizational principles of cells and even organisms.
These research pathways took two distinct routes starting
in the 1930s. One route employed the new macromolecular
chemistry of proteins pioneered by Svedberg,
Pauling, and like-minded chemists, and X-ray diffraction
analyses being developed by mineralogists and physicists
such as the Braggs, father and son, and Astbury and
colleagues, mainly in Great Britain. Another route was
taken by biologists and some curious physicists who
approached complex biological problems with a combination
of reductionism and holism.

The latter program was reductionistic because it endeavored
to find “simple systems” as exemplars of all of biology,
but holistic because it treated the entire biological system 
(e.g., a bacterial cell or virus) as a “black box” with its 
intrinsic complexities. An article of faith in this endeavor, 
however, was that the observable behavior of the black box 
reflected some of the simple, knowable processes that 
occurred inside. The goal of this research program was to 
design experiments that connected empirical observations 
to fundamental cellular processes. 


Heredity and “the Gene” 

A crucial biological problem recognized by some scientists in 
the 1930s involved the mechanisms of heredity. Genetic 
analysis of model organisms such as insects, especially fruit 
flies, and plants such as barley and maize suggested that “the 
gene” as a concept was far from clear. On one hand, genes 
were transmitted from generation to generation with 
remarkable fidelity, and on the other hand, genes were implicated
in the development of organisms and cells. The
unusual stability of genes, however, contrasted with their
rare but dramatic mutation to other forms which were
(usually) highly stable. These linked puzzles of faithful transmission
and rare but stable mutation suggested that genes
were objects with unusual properties. Prior to the current 
understanding of genes as macromolecular entities with 
information-containing structure, genes were conceived 
as “factors,” “determinants,” or simply conceptual entities 
without physical reality at all (9, 11).

While genes were believed to be associated with the visible
cytological structures called chromosomes, the physical
nature of genes was unclear. The concept of the gene as some
sort of “unit of hereditary information” was not in the thinking
of biologists prior to the late 1950s (12). In the 1920s and
1930s, the diversity of proteins, deduced from the amino acid 
compositions and sizes of proteins from different sources, 
suggested that proteins might be related to the remarkable 
diversity and specificity attributed to genes. 

Bacteriophages and Genes 

Almost from their first discovery in 1917, bacteriophages
were viewed as relevant to these studies on the
nature of the gene. Because of their small size (and presumed
simple structure) and the claim that they multiplied to yield
progeny similar to the parental phage, bacteriophages were
viewed by some investigators as the most primitive form of
life and, perhaps, “naked genes” (20). Although it seems
likely that bacteriologists had encountered bacteriophages 
from the beginning of their discipline, it was in 1915 that 
F. W. Twort reported on a strange phenomenon he termed 
“glassy transformation” of cultures of micrococci (27). He 
noted that the calf lymph, from which he was trying to 
grow vaccinia on cell-free medium, contained a serially 
transmissible agent that induced a watery dissolution of 
the bacteria, leaving nothing but subcellular granules. 
Quite independently, and for quite different reasons, Felix
d'Herelle in 1917 discovered a microbe that was “antagonistic”
to bacteria and that resulted in their lysis in liquid
culture and killing in discrete patches (he called them
plaques) on the surface of agar seeded with the bacteria.
d'Herelle conceived of these invisible microbes as “ultraviruses”
that invaded bacteria and multiplied at their
expense, and so he termed them bacteriophage (7). He
astutely realized that the plaque count provided a way to
enumerate these invisible agents, which he conceived as
particulate. He was able to show that phage multiplied in
“waves” or “steps” representing cycles of infection, multiplication,
release, and reinfection.
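
The arithmetic behind enumerating phage by plaque count can be sketched as follows. This is a minimal illustration of the present-day plaque-assay convention, not a reconstruction of d'Herelle's own procedure; the function name, its parameters, and the example numbers are hypothetical and chosen only for illustration. It assumes each plaque arises from a single infectious particle in the plated sample.

```python
def titer_pfu_per_ml(plaque_count, dilution, volume_plated_ml):
    """Estimate plaque-forming units (PFU) per mL of the undiluted stock.

    plaque_count     -- plaques counted on one plate (hypothetical input)
    dilution         -- fraction of the stock concentration in the plated
                        sample, e.g. 1e-6 for a million-fold dilution
    volume_plated_ml -- volume of diluted sample spread on the plate, in mL
    """
    if plaque_count < 0:
        raise ValueError("plaque_count must be non-negative")
    if not 0 < dilution <= 1:
        raise ValueError("dilution must be a fraction in (0, 1]")
    if volume_plated_ml <= 0:
        raise ValueError("volume_plated_ml must be positive")
    # Assumption: one plaque corresponds to one infectious particle.
    return plaque_count / (dilution * volume_plated_ml)


# Hypothetical example: 150 plaques from 0.1 mL of a 10^-6 dilution.
example_titer = titer_pfu_per_ml(150, 1e-6, 0.1)
print(f"{example_titer:.2e} PFU/mL")
```

Because the count scales linearly with dilution and volume, plates from successive dilutions of one stock should give roughly the same estimate, which is one way to check that the agent behaves as discrete particles.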

d'Herelle pioneered two important pathways of phage 
research: one was based on his finding that phage titers 
rose in patients with infectious diseases just as recovery 
was taking place. From these observations he believed 
that phages represented a natural agent of resistance to 
infectious diseases, and went on to advocate phages as therapeutic
agents in the pre-antibiotic era. His concept of
phages as viruses of bacteria was not widely accepted (the 
leading authorities up until the early 1940s thought that 
the lytic phenomena associated with bacteriophages 
resulted from an autocatalytic activation of an induced 
endogenous lytic enzyme). To counter his critics as well as 
to establish his priority for the discovery of phage in a long-running
dispute with Twort’s supporters, d’Herelle’s second
research program examined the biological nature of bacteriophage.
All his evidence pointed to the conception of phages
as organized infectious agents that are obligate intracellular 
parasites. The antigenic properties as well as host ranges 
of phages appeared to be characteristic of given “races” 
of phages, and gave hints that phage genetics might be a 
fruitful topic for study (8). 

Genes, Phage, and Radiation 

For most of the first half of the twentieth century the chemical
structures and functions of most cellular components
were inaccessible to direct experimentation. During this
period, however, quite ingenious use was made of a technique
that was appropriated directly from the field of
atomic physics. Just as the bombardment of atomic nuclei
by high-energy particles such as X-rays and electrons
provided information about the internal structure of the
atom, by the mid-1920s physicists had developed conceptual
approaches to probe the interior of cells by bombardment
with projectiles such as X-rays and other high-energy particles.
From a simple X-ray inactivation curve, without any
chemical or structural knowledge, the size and shape of an
enzyme or virus could be deduced. When it was shown by
Hermann J. Muller in 1927 that X-rays could induce mutations
in genes, the target theory approach was immediately
applied to genes to determine the basic size and shape of “a
gene.” Because this approach was well known to physicists
and employed straightforward mathematical formalisms,
it is not surprising that many physicists who were interested
in biological problems used radiobiological methods in their
research.
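
The formalism referred to above can be illustrated with the simplest single-hit version of target theory. This is a hedged sketch under stated assumptions, not the specific analysis used by any of the investigators named here. It assumes that one random hit anywhere in a sensitive volume v inactivates the particle, and that dose D is expressed as hits per unit volume, so the surviving fraction is S = exp(-v * D) and the inactivation curve is a straight line on a semilog plot. The function names and the numerical values are hypothetical.

```python
import math


def surviving_fraction(dose, target_volume):
    """Single-hit model: S = exp(-v * D), with D in hits per unit volume."""
    return math.exp(-target_volume * dose)


def estimate_target_volume(doses, fractions):
    """Estimate v from an inactivation curve by least squares through the
    origin on the linearized form  -ln(S) = v * D.
    """
    if len(doses) != len(fractions) or not doses:
        raise ValueError("need equal-length, non-empty inputs")
    if any(not 0 < s <= 1 for s in fractions):
        raise ValueError("surviving fractions must lie in (0, 1]")
    denominator = sum(d * d for d in doses)
    if denominator == 0:
        raise ValueError("at least one dose must be non-zero")
    numerator = sum(d * -math.log(s) for d, s in zip(doses, fractions))
    return numerator / denominator


def equivalent_sphere_diameter(volume):
    """Diameter of a sphere with the given volume (an assumed target shape)."""
    return (6.0 * volume / math.pi) ** (1.0 / 3.0)


# Hypothetical data: doses in hits per cm^3, target volume in cm^3.
true_volume = 2e-18
doses = [1e17, 2e17, 5e17]
fractions = [surviving_fraction(d, true_volume) for d in doses]
estimated_volume = estimate_target_volume(doses, fractions)
print(estimated_volume, equivalent_sphere_diameter(estimated_volume))
```

In this simple form the curve yields only a volume; converting it to a diameter requires assuming a shape, here a sphere, and the estimate is only as good as the single-hit assumption.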

For example, one of the century’s greatest physicists,
who thought seriously about biological problems, was Niels
Bohr. His essay, “Light and Life,” published in 1933 (2), was
said to have been highly influential on the thinking of
many physicists. Leo Szilard, the inventor of the nuclear
chain reaction, became involved in full-time radiobiological
research after World War II when he decided to take up biological
problems after his wartime work on nuclear physics. In
France, Marie Curie undertook biological studies of radiation
based on target theory models, and Fernand Holweck,
another brilliant physicist of broad interests, collaborated 
with biologists at the Institut Pasteur, including Salvador 
Luria, a medically trained physicist working first in Rome, 
then in Paris, and finally in the United States, to study 
virus structure and function in the late 1930s using formal 
target theory. 

The application of radiobiology to study the nature of the 
gene was the main program of a small group in pre-war 
Germany consisting of Max Delbrück, a physicist, Karl
Zimmer, a radiobiologist, and Nicolai Timofeeff-Ressovsky, a 
visiting Russian geneticist (26). These scientists published 
what Gunther Stent has characterized as nothing less than 
an attempt at a quantum mechanical description of the 
gene (25). Their experimental approach was, again, almost 
exclusively based on target theory. 

What started as a problem in microbiology was adopted 
and adapted to the research program of the radiobiologists 
and geneticists (24). Salvador Luria studied the sizes of 
phages using target theory models. Delbrück, a student 
of Meitner and Hahn, moved to the California Institute 
of Technology in 1938 with plans to develop his theories of 
the gene using fruit flies, but upon arriving at Caltech he 
learned of phage research already being carried out there 
by Emory Ellis. Ellis had initiated a project to study the 
basic biology of phage as a model for oncogenic virus 
research. Delbrück was impressed by Ellis’s confirmation 
of d’Herelle’s one-step growth curves and so he joined in 
the study of bacteriophage multiplication as a black box 
model (or “gadget” in Delbrück’s terminology) for heredity. 

Concurrently with the work of Ellis and Delbrück and that 
of Luria, Alfred Hershey was studying phages from the 
point of view of their physiology. He was collaborating 
with Jacques Bronfenbrenner at Washington University in 
St. Louis, Missouri, who had a long-standing interest in the 
possible metabolic and structural organization of bacteriophage. 

Delbrück was a natural organizer and, together with 
Luria and Hershey, he began to recruit people to work on 
phage biology. He developed a group of protégés, followers, 
devotees, and students who adopted his viewpoints on the 
important problems of phage research and the legitimate 
ways to approach these problems. Delbrück saw the value 
in focusing research on a small group of phages so that 
results from different laboratories could be compared, and 
so he selected a group of “authorized phages” which became 
designated the T-phages, T1-T7 (T for type) (6). 

Gene Regulation and Lysogenic Phages 

While Delbrück was suspicious of the phenomenon 
known as lysogeny, this problem was directly attacked 
by André Lwoff and his colleagues in Paris in the 1950s. 
Lysogenic phenomena were observed almost from the first 
days of phage research, but the nature of the relationship 
between phage and host was unclear. Was lysogeny a sort of 
smoldering, persistent infection with the phage multiplying 
in some sort of steady state with the growth of the host, or 
did the phage become truly latent? Eugene and Elizabeth 
Wollman suggested that phage in the lysogenic state 
behaved as part of the cellular hereditary apparatus. Lwoff 
and Antoinette Gutmann finally clarified the nature of lysogeny 
and christened the latent form of the phage “prophage” 
in their work in 1950, in which they followed phage induction 
and release from single cells using direct microscopic 
observation and sampling with a micromanipulator (18). 
Lysogeny, induction, and its regulation became a major 
focus of phage research in Paris in the 1950s. The thesis of 
François Jacob was on lysogeny in Pseudomonas pyocyanea, 
and in the 1950s the study of lysogeny provided the groundwork 
for the operon concept of gene regulation that was 
developed in the Service de Physiologie Microbienne at the 
Institut Pasteur (19). 

Just as the T-phages were the model organisms for lytic 
phage research, bacteriophage λ, discovered in 1951 by 
Esther Lederberg, became the prototypic lysogenic phage 
(13). Study of λ phage has provided a deep understanding 
of regulation of gene expression on one hand, and the 
mechanisms of lysogeny on the other. 

While the French phage workers pursued research of a 
more physiological sort, based as it was on Lwoff’s lifelong 
interest in growth, nutrition, and physiological adaptations, 
Delbrück’s followers, who came to be called the American 
Phage Group, favored more direct physical approaches, one 
of which was the target theory method. 


Phage and the Physical Gene 

The nature and cause of gene mutations were investigated in 
the 1930s, and a key question emerged from this work: Were 
mutations caused by the selective growth conditions (e.g., 
the addition of lactose to the medium) or did the mutations 
occur randomly all the time and their existence become 
known by imposition of the selective growth conditions? 
The outcome of this research would have profound implications 
not only for the science of genetics but also for a deeper 
understanding of evolutionary biology in general. 

The major problem in this work was one of experimental 
design: how to observe a rare event that happened in a huge 
population prior to the selection for the outcome of that 
event. In a particularly clear and convincing work, Isaac 
M. Lewis examined the mutational change from the inability 
to ferment lactose to the ability to use this sugar source in 
the Escherichia coli strain mutabile. He concluded that 
the mutations to lactose utilization occurred prior to the 
selection, not as a consequence of exposure to the selective 
conditions. His elegant and clear approach, however, did 
not change many minds, but in the 1940s two related 
experimental approaches gave results believed to settle this 
question. Both of these employed bacteriophage as experimental 
tools. Delbrück and Luria knew from the work of 
d’Herelle that bacteria often developed heritable resistance 
to phage. Their routine use of statistical models in their 
target theory, as well as their backgrounds in atomic physics, 
helped them to devise a statistical approach (“fluctuation 
test”) to show that phage-resistant mutants existed in the 
bacterial population prior to exposure to the lethal effects of 
phage (17). The method of Luria and Delbrück, and a related 
procedure devised in 1949 by Howard B. Newcombe, were 
indirect and mathematical. However, in 1952 Joshua and 
Esther Lederberg devised a simple and direct way to demonstrate 
that mutations were occurring in a random way, independent 
of the selection procedures: they transferred very 
large numbers of colonies from one plate to another by the 
use of velvet cloth as a transfer tool (15). Thus, “replica 
plates” could be used to test colonies in great numbers for 
mutant properties. They applied this technique to study 
of phage resistance as well as streptomycin resistance: 
again, it was clear that the mutants had appeared before 
the application of the selective agent. 
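
The statistical reasoning behind the fluctuation test can be illustrated with a small simulation. The sketch below is not the published procedure: the growth model, the function names, and every parameter value (number of cultures, generations, mutation and induction probabilities) are assumptions chosen only for illustration. It compares the variance-to-mean ratio of resistant-cell counts across parallel cultures under the two hypotheses. If mutations arise at random during growth, a few cultures with early mutations yield very large counts and the ratio is far above 1; if resistance were induced only on contact with phage, the counts would be Poisson-like and the ratio close to 1.

```python
import random
import statistics


def spontaneous_culture(generations, mutation_prob, rng):
    """Resistant cells in one culture when mutations arise at random during growth."""
    sensitive, resistant = 1, 0
    for _ in range(generations):
        daughters = 2 * sensitive
        # Each new sensitive daughter may mutate, independently of any selection.
        new_mutants = sum(1 for _ in range(daughters) if rng.random() < mutation_prob)
        sensitive = daughters - new_mutants
        resistant = 2 * resistant + new_mutants  # existing mutants breed true
    return resistant


def induced_culture(generations, induction_prob, rng):
    """Resistant cells if resistance arose only on contact with phage."""
    population = 2 ** generations
    return sum(1 for _ in range(population) if rng.random() < induction_prob)


def fano_factor(counts):
    """Variance-to-mean ratio; close to 1 for Poisson-like counts."""
    return statistics.pvariance(counts) / statistics.mean(counts)


def run_fluctuation_test(cultures=200, generations=12, seed=1):
    """Return the variance-to-mean ratio under each hypothesis (illustrative values)."""
    rng = random.Random(seed)
    spontaneous = [spontaneous_culture(generations, 1e-3, rng) for _ in range(cultures)]
    # Induction probability chosen so that both hypotheses give a similar mean count.
    induced = [induced_culture(generations, 0.012, rng) for _ in range(cultures)]
    return fano_factor(spontaneous), fano_factor(induced)


if __name__ == "__main__":
    f_spontaneous, f_induced = run_fluctuation_test()
    print(f"variance/mean, mutation before selection: {f_spontaneous:.1f}")
    print(f"variance/mean, mutation induced by selection: {f_induced:.1f}")
```

A large excess of variance over the mean, as in the first case, is the signature that was taken as evidence for mutation prior to selection.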

The random nature of mutation and its very low 
frequency of occurrence suggested that it might be similar 
to, or governed by, a quantized, two-state process. This 
model appealed to physicists such as Delbrück, Erwin 
Schrödinger, and Bohr, who thought deeper understanding 
of this paradoxical behavior might reveal new physical 
laws of nature (25). Remarkably, this deeper insight 
was provided by strictly formal genetic analysis of phage 
mutations. Hershey had employed plaque morphology 
mutants (large plaques, interpreted as rapid-lysis mutants 
or “r-mutants”) to show that phages could be “mated” or 
“crossed” by simultaneous mixed infections and thus it was 
possible to carry out formal genetic analysis on phage just as 
with sexually mating organisms (10). Seymour Benzer found 
one class of Hershey’s r-mutants of phages T2 and T4 to be 
particularly interesting: they did not give any plaques at 
all on bacterial hosts carrying the λ prophage (K-12(λ)) but 
gave the large (r-mutant) plaques on the usual bacterial 
host (strain B). Benzer exploited this case of conditional 
expression of a phage mutation to develop the fine structure 
genetic analysis of the T4 rII gene (1). Because of the strong 
specificity of the system (the discrimination against the 
rII mutants in host K-12(λ) is greater than 10^8), very low 
frequency wild-type r+ recombinants or revertants can be 
detected, and hence even recombination between very 
close mutations can be observed. Benzer calculated that 
he could detect recombination between adjacent base pairs 
in the rII gene. Detailed genetic analysis of deletion 
and insertion mutations in the T4 rII gene, carried out in 
phage crosses by Francis H. C. Crick, Leslie Barnett, Sydney 
Brenner, and Richard J. Watts-Tobin, also provided 
strong evidence for the triplet nature of the genetic code (5). 

By the 1960s it became clear that phages represented 
only one sort of extrachromosomal genetic structure that 
could exist in bacteria. The fertility factor F, transmissible 
drug-resistance determinants, as well as prophages, all 
represented examples of what Joshua Lederberg termed 
“plasmids” (14). It remained to be determined in what form 
these plasmids existed in the cell. Some experiments 
suggested that the plasmids (the F-factor in Hfr strains, and 
some prophages) were attached to the cell chromosome 
somehow. A very fruitful model for how this attachment 
could occur was proposed in 1962 by Allan Campbell (3). 
He reasoned that because the genetic map of phage T4 
was circular while the phage DNA appeared to be linear, 
the phage might assume a circular intracellular form. This 
circular form could, with a single reciprocal recombination 
event, become linearly integrated into the chromosomal 
DNA. Although it turned out that T4 has a circular genetic 
map for reasons other than forming a physically circular 
molecule, Campbell’s model was soon confirmed for phage λ, 
and eventually many other plasmids. It has provided the 
conceptual basis for retrovirus integration and excision 
in animal cells as well. 


Genes as Information 

As the study of phage replication and gene expression developed, 
it became clear that the workings of genes could be 
understood as a unified process, which was captured by Crick’s 
so-called central dogma of molecular biology. Describing the 
function of genes in information-theoretic terms, he stated that 
information flows from DNA to DNA and from DNA to RNA 
and thence to proteins. He also noted at the time that it was 
theoretically possible for information flow to occur from 
RNA to DNA and RNA to RNA (4). Key experimental support 
for the role of RNA as an intermediary in such information 
transfer was the finding that in phage-infected cells, the 
base composition of newly synthesized RNA was much 
more like that of the phage DNA than the host DNA (28). 
Thus, it was concluded that phage gene function required 
the synthesis of RNA molecules that differed from those 
present in the uninfected cell, and that the RNA base 
composition, and hence its information content, was determined 
by the DNA of the phage. 

From the 1960s, phage research was greatly advanced by 
new biochemical approaches originating from the early 
work of Seymour Cohen, Lloyd Kozloff, and others. The availability 
of radioactive isotopic techniques, improved methods 
of protein chemistry, and advances in enzymology contributed 
to the influx of biochemists and biochemical methods in 
phage work. This fruitful collaboration among biochemists, 
geneticists, and microbiologists led to detailed descriptions 
of the mechanisms of phage replication and transcription, 
and of phage morphogenesis and assembly, and to a detailed 
understanding of phage adsorption and entry phenomena. 
Many of these advances are still ongoing and are part of the 
recent history of phage molecular biology described in this 
book. 

Classification of Bacteriophages 

HANS-W. ACKERMANN 


Bacteriophages or “phages” were discovered twice in a 
short time. The British pathologist Frederick William 
Twort described in 1915 the glassy transformation of “Micrococcus” 
colonies by an unknown agent. In 1917, Felix Hubert d’Herelle, 
a French Canadian then working at the Pasteur 
Institute of Paris, observed the destruction of Shigella 
bacteria in broth (5). Unlike Twort, he clearly recognized 
the viral nature of this phenomenon and devoted the 
rest of his scientific career to it. He coined the term “bacteriophage,” 
devised several techniques still in use, and introduced 
the phage treatment of infectious diseases. For him, 
there was only one bacteriophage species with many races: 
the Bacteriophagum intestinale (6). 

The viral nature of bacteriophages was confirmed in 
1940 with the advent of the electron microscope. In 1962, 
Lwoff, Horne, and Tournier published a system of viruses 
based on morphology and nucleic acid type. They proposed 
the order Urovirales for tailed phages and the families Inoviridae 
and Microviridae for filamentous and φX-type phages, 
respectively (9). A further milestone was the recognition 
of six basic phage types: tailed phages (with contractile, 
long noncontractile, or short tails), filamentous 
phages, and icosahedral phages with single-stranded DNA 
or single-stranded RNA. This simple scheme, proposed 
by Bradley in 1967 (4), is still the basis of the present edifice 
of phage classification. 

In 1971, the International Committee on Taxonomy of 
Viruses (ICTV) classified phages into six genera corresponding 
to five of Bradley’s basic types, namely the T4, λ, φX174, 
MS2, and fd phage groups, and the newly described type 
PM2 (16). New phage groups were added over time. The 
most recent development is the establishment of the order 
Caudovirales for tailed phages and of 15 tailed phage genera 
(1, 10, 14). At present, at least 5136 bacterial viruses have 
been examined in the electron microscope (3). This makes 
bacteriophages the largest viral group in nature. 

Phage Classification Today 

The ICTV presently recognizes one order, 13 families, 
and 31 genera of phages (14). Virions have binary, cubic, or 
helical symmetry, or are pleomorphic. Most phages 
contain dsDNA, but there are small phage groups with 
single-stranded (ss) DNA, ssRNA, or double-stranded 
(ds) RNA. A few types have a lipid-containing envelope or 
contain lipids as part of the particle wall. Tailed phages 
(binary symmetry) total about 5100 viruses at the time of 
writing (96% of phages) and are classified into the order 
Caudovirales and three very large, phylogenetically related 
families. By contrast, “cubic,” filamentous, and pleomorphic 
phages comprise fewer than 190 viruses (3.6% of phages) 
and are classified into 10 small families. They are extremely 
diverse in their basic properties and seem to constitute 
many different lines of descent. Capsids with cubic 
symmetry are icosahedra or related bodies. Particles 
may or may not be enveloped. The presence of lipids is accompanied 
by low buoyant density and high sensitivity to chloroform 
and ether. 

As elsewhere in virology, families are chiefly defined by 
the nature of the nucleic acid and particle morphology (figure 2-1; 
table 2-1). There are no universal criteria for genus and 
species delineation. The ICTV uses every available property 
for classification and has adopted the “polythetic species 
concept,” meaning that a species is defined by a set of 
properties, some of which may be absent in a given 
member (13). Taxonomic names of orders, families, and 
genera are typically constructed from Latin or Greek roots 
and end in -virales, -viridae, and -virus, respectively. Most 
taxa of “cubic,” filamentous, and pleomorphic phages have 
latinized names. So far, tailed phage genera have vernacular 
names only. 

Viruses with Binary Symmetry 

(Caudovirales, Tailed Phages) 

At least 4950 tailed phages are known (3). Particles consist of 
a head with cubic symmetry and a “helical” tail and are said 
to be of binary symmetry (9). The head-tail structure is 
unique in virology. Although tail-like structures occur in a 
few other viruses, such as tectiviruses (see below), some 
algal viruses, and polydnaviruses, they are inconstant and 

Figure 2-1 Schematic representation of major phage groups. 

Table 2-1 Classification and Basic Properties of Bacteriophages 

Symmetry         Nucleic acid   Order and families  Genera  Members  Particulars
Binary (tailed)  DNA, ds, L     Caudovirales        15      4950
                                Myoviridae          6       1243     Tail contractile
                                Siphoviridae        6       3011     Tail long, noncontractile
                                Podoviridae         3       696      Tail short
Cubic            DNA, ss, C     Microviridae        4       40
                 ds, C, T       Corticoviridae      1       3?       Complex capsid, lipids
                 ds, L          Tectiviridae        1       18       Internal lipoprotein vesicle
                 RNA, ss, L     Leviviridae         2       39
                 ds, L, S       Cystoviridae        1       1        Envelope, lipids
Helical          DNA, ss, C     Inoviridae          2       57       Filaments or rods
                 ds, L          Lipothrixviridae    1       6?       Envelope, lipids
                 ds, L          Rudiviridae         1       2        Resembles TMV
Pleomorphic      DNA, ds, C, T  Plasmaviridae       1       6        Envelope, lipids, no capsid
                 ds, C, T       Fuselloviridae      1       8?       Spindle-shaped, no capsid

Modified from Ackermann, H.-W. 2001. Le matin des bactériophages. Virologie 5:35-43. With permission of John Libbey-Eurotext. Phage numbers are from (3). 

C, circular; L, linear; S, segmented; T, superhelical; ss, single-stranded; ds, double-stranded. 


do not compare with the regular and constant tails of tailed 
phages (1). This feature, and many morphological or physiological 
properties, indicate that tailed phages constitute a 
monophyletic evolutionary group. At the same time, tailed 
phages are extremely varied in DNA content and composition, 
dimensions and fine structure, physiology, and lifestyle; 
for example, DNA sizes vary between 17 and 500 kb 
and tail lengths range from 10 to 800 nm. Tailed phages 
represent the most diversified of all virus groups. 


Virions have no envelope and consist typically of protein 
and DNA. Lipids are generally absent. Heads are icosahedra 
or elongated derivatives thereof. Isometric heads prevail 
(85%). Capsomers are rarely visible. Tails are helical or 
consist of stacked disks and carry in most cases fixation 
structures such as base plates, spikes, or terminal fibers. 
Heads, tails, and tail fibers are synthesized in separate pathways 
and assembled later. Virion response to inactivating 
agents is variable and no generalization is possible here. 

Table 2-2 Selected Properties of Major Phage Groups 

                                                                Nucleic acid
Family or group       Examples   Capsid (nm)        Lipids (%)  %       MW        Infection  Release
Caudovirales          T4, λ, T7  67                 -           46      79        V or T     Lysis
  Range                          30-160                         30-62   17-498
Microviridae          φX174      27                             26      4.4-6.1   V          Lysis
Corticoviridae        PM2        60                 13          14      9.0       V          Lysis
Tectiviridae          PRD1       63                 15          14      15        V          Lysis
Leviviridae           MS2        23                 -           30      3.5-4.3   V          Lysis
Cystoviridae          φ6         75-80              20          10      13.4      V          Lysis
Inoviridae, Inovirus  fd         760-1950 x 7       -           67-21   5.8-7.3   S or T     Excretion
  Plectrovirus        L51        85-250 x 7         -                   4.4-8.3   P          Excretion
Lipothrixviridae      TTV1       400-2400 x 20-40   22          3       16-42     V or T     Lysis
Rudiviridae           SIRV1      780-900 x 23       -                   33-36     P          Excretion
Plasmaviridae         L2         80                 11                  11.7      T          Excretion
Fuselloviridae        SSV1       85 x 55            10                  15        T          Excretion

Modified from Ackermann, H.-W. 2001. Le matin des bactériophages. Virologie 5:35-43. With permission of John Libbey-Eurotext. More physiological data may be found in (2). 

MW, molecular weight in kb or kbp x 10^6; P, permanent; S, steady-state; T, temperate; V, virulent; -, none. 


Despite the general absence of lipids, about one third of 
tailed phages are chloroform-sensitive. 

The DNA is a single, linear, double-stranded filament. Its 
composition generally reflects that of the host bacterium, 
but some DNAs contain unusual bases such as 5-hydroxymethylcytosine 
or 5-hydroxymethyluracil. Genetic maps 
are complex and include about 290 genes in phage T4 
(possibly many more in larger phages). Genes for related 
functions cluster together. During maturation, the DNA 
enters preformed capsids. 

Tailed phages are divided into three families: 

1. Myoviridae, with contractile tails consisting of a 
sheath and a central tube (25% of tailed phages). The 
sheath is separated from the head by a neck. 

2. Siphoviridae, with long, noncontractile tails (61%). 

3. Podoviridae, with short tails (14%). 
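
The percentages quoted for the three families follow directly from the member counts listed in table 2-1. The short calculation below reproduces them; the dictionary and function names are illustrative only and not part of any established tool.

```python
# Member counts per family, as listed in table 2-1.
tailed_phages = {"Myoviridae": 1243, "Siphoviridae": 3011, "Podoviridae": 696}


def family_shares(counts):
    """Return each family's share of the total, rounded to a whole percent."""
    total = sum(counts.values())
    return {family: round(100 * n / total) for family, n in counts.items()}


shares = family_shares(tailed_phages)
print(sum(tailed_phages.values()))  # 4950
print(shares)  # {'Myoviridae': 25, 'Siphoviridae': 61, 'Podoviridae': 14}
```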

Genera are differentiated by genome structure (presence 
or absence of cos or pac sites, terminal redundancies, and 
circular permutations), concatemer formation, presence or 
absence of unusual bases and DNA or RNA polymerase 
genes, and DNA sequence. Classification into genera is still 
at its beginnings (table 2-3). Many more tailed phage genera 
are likely to be recognized in the future. The few presently 
defined genera may be seen as islands in a sea of nonclassified 
phages, or as crystallization points for phages awaiting 
classification. The term “λ-like viruses” is not a synonym of 
“lambdoid phages.” The latter is a vernacular term that 
denotes the presence of common genes. It may be recalled 
that λ and P22, here attributed to distinct genera, have no 
more than 13.5% DNA homology (12, 15). About 250 
species are presently recognizable, mostly on the basis of 
morphology, DNA-DNA hybridization and sequencing, and 
serology. 


Phages with Cubic Symmetry and DNA 

Microviridae (ssDNA) 

Virions are small (27 nm in diameter), have no envelope, 
and contain a single piece of circular ssDNA. Phages infect 
very different hosts (enterobacteria, Bdellovibrio, Chlamydia, 
Spiroplasma) and are classified into four genera. 

Corticoviridae (dsDNA) 

The only certain member of the family Corticoviridae is a 
marine phage, PM2. Its capsid consists of two protein 
shells between which a lipid bilayer is sandwiched. Two 
similar viruses were isolated from seawater, but their taxonomic 
position is unclear. 

Tectiviridae (dsDNA) 

Tectiviruses are unique in their structure and mode of infection. 
A rigid outer protein capsid surrounds a thick, flexible 
lipoprotein vesicle. Upon adsorption to bacteria or chloroform 
treatment, this vesicle becomes a tail-like tube about 
60 nm in length that serves as a nucleic acid ejection device. 
Tectiviruses of bacilli have apical spikes. Despite their small 
numbers, tectiviruses occur in widely different hosts such 
as enterobacteria, bacilli, and Thermus bacteria. 

Phages with Cubic Symmetry and RNA 

Leviviridae (ssRNA) 

Virions of the Leviviridae resemble enteroviruses and 
have no particular morphological characteristics. Most 

Table 2-3 Tailed Phage Genera 

Family        Genus              Type species  Species  Members (a)  Principal hosts
Myoviridae    T4-like viruses    T4            7        47 (+100)    Enterobacteria
              P1-like viruses    P1            3        12           Enterobacteria
              P2-like viruses    P2            2        16           Enterobacteria
              Mu-like viruses    Mu            1        2            Enterobacteria
              SPO1-like viruses  SPO1          1        13           Bacillus
              φH-like viruses    φH            1        2            Halobacterium (b)
Siphoviridae  λ-like viruses     λ             1        7            Enterobacteria
              T1-like viruses    T1            1        11 (+50)     Enterobacteria
              T5-like viruses    T5            1        5 (+20)      Enterobacteria
              L5-like viruses    L5            1        4 (+15)      Mycobacterium
              c2-like viruses    c2            1        5 (+200)     Lactococcus
              ψM1-like viruses   ψM1           1        3            Methanobacterium (b)
Podoviridae   T7-like viruses    T7            3        26           Enterobacteria
              P22-like viruses   P22           1        11           Enterobacteria
              φ29-like viruses   φ29           4        12           Bacillus

(a) Parentheses indicate approximate numbers of poorly characterized isolates that may or may not represent independent species. 
(b) Archaea. 


known leviviruses are plasmid-specific coliphages 
that adsorb to F or sex pili. They have been divided, by 
serology and other criteria, into two genera. Several as 
yet unclassified leviviruses are specific for other plasmid 
types (e.g., C, H, M) or occur outside of the enterobacteria 
family (table 2-4). 

Cystoviridae (dsRNA) 

The family Cystoviridae consisted until recently of a single 
member, but several related viruses have recently been 
found. Cystoviruses are unique among bacteriophages 
because they contain three molecules of dsRNA and RNA 
polymerase. They have lipid-containing envelopes and have 
no morphological resemblance to other viruses with dsRNA, 
such as reoviruses or totiviruses (chapter 16). 

Phages with Helical Symmetry 

Inoviridae (ssDNA) 

The family Inoviridae has two genera with very different 
host ranges. Their similarities in replication and morphogenesis 
seem to derive from the single-stranded nature of 
phage DNA rather than from a common origin. Despite the 
absence of lipids, viruses are chloroform sensitive. The 
Inovirus genus includes 42 phages that are long, rigid or 
flexible filaments of variable length and have been classified 
into 29 species by particle length, coat structure, and DNA 
content. They occur in a few Gram-negative bacteria 
and also in Clostridium and Propionibacterium. Viruses are 
sensitive to sonication and very resistant to heat. Many 
of them are plasmid-specific. The Plectrovirus genus includes 
15 isolates. Phages are short, straight rods and occur in 
mycoplasmas only. 


Lipothrixviridae (dsDNA) 

The family Lipothrixviridae includes four viruses of the 
archaebacterial genus Thermoproteus. Particles are characterized 
by the combination of a lipoprotein envelope and 
rodlike shape. 

Rudiviridae (dsDNA) 

The family Rudiviridae includes two viruses of different 
length that have been isolated from thermophilic archaebacteria. 
Particles are straight, rigid rods without envelopes and 
closely resemble the tobacco mosaic virus. 


Table 2-4 Host Range of Phages by Group and Host Genus 

Phage group or family  Eubacteria                                              Archaea
Caudovirales           Any (ubiquitous)                                        Extreme halophiles and methanogens
Microviridae           Enterobacteria, Bdellovibrio, Chlamydia, Spiroplasma
Corticoviridae         Alteromonas
Tectiviridae           a. Enterics, Acinetobacter, Pseudomonas, Thermus, Vibrio
                       b. Bacillus, Alicyclobacillus
Leviviridae            Enterics, Acinetobacter, Caulobacter, Pseudomonas
Cystoviridae           Pseudomonas
Inoviridae
  Inovirus             Enterics, Pseudomonas, Thermus, Vibrio, Xanthomonas
  Plectrovirus         Acholeplasma, Spiroplasma
Plasmaviridae          Acholeplasma
Lipothrixviridae                                                               Acidianus, Sulfolobus, Thermoproteus
Rudiviridae                                                                    Sulfolobus
Fuselloviridae                                                                 Acidianus, Haloarcula, Sulfolobus

Pleomorphic Phages 

Plasmaviridae (dsDNA) 

Only one certain member of the Plasmaviridae is known: 
Acholeplasma virus MVL2 or L2. It contains dsDNA, has no 
capsid, and may be called a nucleoprotein granule with a 
lipoprotein envelope. Four similar isolates are known, but 
one of them has been described as containing ssDNA and 
they cannot be classified. 

Fuselloviridae (dsDNA) 

The best-known member of the family Fuselloviridae, SSV1, 
persists in the archaeon Sulfolobus shibatae both as a plasmid 
and as an integrated prophage. It has been produced upon 
induction, but has not been propagated for lack of a 
suitable host. Particles are lemon-shaped with short spikes 
at one end. The coat consists of two hydrophobic proteins 
and host lipids. It is disrupted by chloroform. 

“Guttavirus” (dsDNA) 

The name “guttavirus” designates a droplet-shaped virus-like 
particle, named SNDV, which has been found in a Sulfolobus 
culture from a solfatara in New Zealand. It has received the 
status of an unassigned genus (14). 


Taxonomical Physiology 

Taxonomic subdivisions correlate with differences in 
physiology and life-style (2, 14). For example, most bacterial 
viruses infect bacteria from the outside after adsorption 
to the cell wall, capsules, pili, or flagella. However, cystovirus 
capsids, after losing their envelope, enter the space between 
cell wall and cytoplasmic membrane, while plasmaviruses 
infect their hosts by fusion of the viral envelope and 
mycoplasmal host cell membranes. Tailed phage DNA replication 
resembles somewhat that of herpesviruses (1) and is 
characterized by particulars which, to our knowledge, 
are not found in other phages. It generally starts at 
fixed sites of the DNA molecule, is bidirectional, and often 
generates giant DNA molecules, called concatemers, which 
are then cut to fit into phage heads. Gene expression is 
largely sequential. In phages with ssDNA, double-stranded 
replicative forms (RF) are produced. RNA molecules are 
never circularized. The RNA of leviviruses acts as mRNA 
and needs no transcription. Microviridae, Leviviridae, 
the Inovirus genus, and Fuselloviridae have overlapping 
genes that, by translation in different reading frames, 
allow the synthesis of several proteins from the same DNA 
or RNA segment. 
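
How overlapping genes yield several proteins from one segment can be made concrete by translating a sequence in each of its three forward reading frames. The sketch below assumes the standard genetic code and uses a short made-up sequence, not a real phage gene; the names `CODON_TABLE` and `translate` are illustrative.

```python
from itertools import product

# Standard genetic code, with codons ordered by base T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(sequence, frame):
    """Translate a DNA sequence starting at offset `frame` (0, 1, or 2).

    Stop codons are shown as '*'; an incomplete trailing codon is ignored.
    """
    codons = (sequence[i:i + 3] for i in range(frame, len(sequence) - 2, 3))
    return "".join(CODON_TABLE[codon] for codon in codons)


segment = "ATGGCCTAA"  # hypothetical sequence, for illustration only
for frame in range(3):
    print(frame, translate(segment, frame))
# 0 MA*
# 1 WP
# 2 GL
```

The same nine bases read as three unrelated amino acid sequences, which is the principle that lets a small genome encode more proteins than its length would otherwise allow.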

In most phage families, the newly synthesized nucleic 
acid enters a preformed capsid; however, the levivirus 
capsid is constructed around or co-assembled with 
phage RNA. Assembly of tailed phages is a complex process 
and involves proteins acting in sequence and separate 
pathways for heads and tails, which are finally joined 
together. Tailed phage assembly often results in aberrant 
structures, for example giant or multi-tailed phages 
and tubules of polymerized head or tail protein, called 
polyheads, polytails, or polysheaths. Inoviruses frequently 
produce particles of abnormal length. The envelope of 
plasmaviruses is acquired by budding, but that of cystovirus 
φ6 is of cellular origin. 

Further differences are seen in the mode of release. Lysis 
occurs in tailed and icosahedral phages and in the Lipothrixviridae. 
Bacterial cells are weakened from the inside and 
burst, liberating some 50 to 200 phages (sometimes many 
more). Inoviruses and fuselloviruses are liberated by extrusion, 
with phages being secreted through the membranes of 
their surviving hosts. Budding through bacterial membranes 
is found in plasmaviruses. Their host cells are not 
lysed and may produce phages for hours. 

Lysogeny is widespread and not confined to tailed phages 
(1). Icosahedral phages are always virulent. About 50% of 
tailed phages are temperate. Lysogeny also occurs in the 
Inovirus genus and the Lipothrixviridae, Plasmaviridae, and 
Fuselloviridae families. Lysogeny is near-ubiquitous in the 
bacterial world and exists in eubacteria and in archaebacteria. 
Its frequency in a given bacterial species varies 
between 0 and 100% (often 40%) according to the species 
and quality of investigation. Tailed phages exhibit three 
types of lysogeny, exemplified by phages λ, P1, and Mu 
(chapter 7). The λ and P1 types seem to be equally frequent 
in tailed phages. The Mu type is exceptional in bacterial 
viruses. Remarkably, the integrase-mediated λ type also 
occurs in the Fuselloviridae and Plasmaviridae families (1). 
Phages of the genus Inovirus integrate into the host genome 
by means of host recombinases (7). The type of lysogeny in 
the Lipothrixviridae is unknown. 

Host Range and Its Evolutionary 
Implications 

Phages have been found in over 140 bacterial genera 
from all parts of the bacterial world: in archaebacteria 
and eubacteria, in aerobic, anaerobic, appendaged, budding, 
gliding, ramified, sporulating, or sheathed bacteria, in 
cyanobacteria, spirochetes, mycoplasmas, and chlamydias 
(table 2-5). Podovirus particles have even been observed in 
bacterial endosymbionts of paramecia. Indeed, bacteriophage 
hosts are represented in nearly all sections of Bergey’s 
Manual (3). Tailed phages described in cultures of green 
algae and filamentous fungi are probably contaminants. 

When phages are grouped according to phylogenetic 
bacterial groups established by rRNA sequencing (16), 
it becomes apparent that most major bacterial phyla are, 

Table 2-5 Frequency of Phages in Bacterial Phylogenetic Divisions 

Division or subdivision                 Myoviridae  Siphoviridae  Podoviridae  Total of tailed phages  CFP
Archaea
  Euryarcheota                          7           7                          14                      4
  Crenarcheota                                                                                         14 (a)
Bacteria
  Bacteroides-Cytophaga-Flavobacterium  34          24            1            59                      2
  Chlamydiales
  Cyanobacteria                         22          6             16           44                      2
  Deinococcus-Thermus                   8           6                          14                      4
  Firmicutes
    High G-C branch                     3           482           21           506
    Low G-C branch                      371         1300          86           1757                    31
  Fusobacteria                                      1             4            5
  Proteobacteria: all                   757         729           536          2022                    128?
    γ subdivision only                  642         530           429          1601                    111?
  Spirochetes                           10          1                          11
Total                                   1212        2555          660          4427                    186?

Based on the organism list of the National Center for Biotechnology Information (NCBI) database (http://www.ncbi.nlm.nih.gov/taxonomy/) (15) and on phage counts from (3). Total phage numbers are lower than in table 2-1 because rRNA data are unavailable for some phage hosts. 

CFP, cubic, filamentous, and pleomorphic phages. 

(a) Including a phage-like particle. 


so far, without phages. Most phages are found in easily 
grown and medically or industrially important bacteria: 

1. Firmicutes with high G-C (coryneforms, mycobacteria,
streptomycetes),

2. Firmicutes with low G-C (bacilli, lactobacilli, lactococci,
clostridia, staphylococci, streptococci),

3. Proteobacteria, especially of the γ subdivision. The
latter includes enterobacteria (over 800 phage observations)
and pseudomonads.

Tailed phages predominate almost everywhere and occur in 
both eubacteria and Euryarcheota, suggesting that they 
originated before separation of these bacterial kingdoms. 
Siphoviridae are particularly frequent in actinomycetes, 
coryneforms, lactococci, and streptococci. Myoviruses 
and podoviruses are relatively frequent in enterobacteria, 
pseudomonads, bacilli, and clostridia. This distribution 
is probably related to features of bacterial speciation, for 
example to evolution of particular receptors or restriction 
endonucleases. 

There is a chasm between the ubiquitous tailed phages
and the other types of bacterial viruses. The latter are relatively
rare and have narrow, particular host ranges.
Inoviruses of the Plectrovirus genus are restricted to mycoplasmas,
and Fuselloviridae, Lipothrixviridae, and Rudiviridae
are limited to a particular archaeal subdivision, the extremely
thermophilic Crenarcheota. This suggests that their
hosts constitute ecological niches, in which these viruses
originated. On the other hand, there are oddities in distribution:
(i) filamentous inoviruses, though generally associated
with enterobacteria and their relatives, occur in such diverse
bacteria as Clostridium, Propionibacterium, and Thermus;
and (ii) tectiviruses are found in enterics, bacilli, and
Thermus. As these phages are plasmid-dependent, their
distribution may be explained by plasmid transfer (3).

Why Phage Classification? 

To microbiologists focusing on a single microorganism, classification
may appear a sterile exercise. Others, impressed by
spectacular advances in limited fields, may consider that
“we know it all.” Indeed, bacteriophages T4 and λ are
among the best-known viruses of all. This way of thinking
is short-sighted and self-defeating. We have barely scratched 
the surface of virology. With respect to bacteriophages, 
(i) research has concentrated on a few viruses and neglected 
the 5000 others known, (ii) research is limited to about 15 
countries in the world, and (iii) important habitats such as 
hot springs or fermentors have been little explored. My 
experience is that, armed with an electron microscope, I am 
certain to find scores of new phages in a bottle of sewage. At 
the same time, phages have practical applications that
demand precise identification. Last but not least, microbiologists
want to understand the living world and many of
them are actively engaged in teaching. For them, and their
students, taxonomy is simplification. It is impossible and
pointless to memorize the properties of 5000 individual
tailed phages, but it is much more rewarding to study tailed
phages as a group. The advantages of classification, especially
with respect to phages, are many:

1. Classification is generalization and simplification. 

a. It promotes comparison and thus virus research 
and better understanding of the viral world. 

Figure 2-2 Selected myoviruses. A: Bacillus megaterium
phage G, the largest myovirus known; note the spiral
filament around the tail. B: Giant unknown bacteriophage
found in macerated debris of Bombyx mori; note the wavy
tail fibers. Phage heads of this size are easily deformed.
C: Bdellovibrio bacteriovorus phage φ1402, the smallest
myovirus known (isolated by B. A. Fane, Department of
Veterinary Science, University of Arizona, Tucson, AZ).
× 297,000; bar indicates 100 nm. Phosphotungstate
(2%, pH 7.2) (A, B) and uranyl acetate (2%, pH 4.0) (C).

Figure 2-3 Myoviruses (A-C) and siphoviruses (D, E).
A: Staphylococcus hyicus phage Twort. B: Coliphage RB69.
The head of the left particle is slightly deformed and
the tail of the right particle is contracted. C: “Killer
particle” of Bacillus subtilis; note the small, bacterial
DNA-containing head. D: Phage NM1 of Sinorhizobium
meliloti with transverse tail disks and globular fixation
structures. E: Bacillus subtilis phage φ105. × 297,000;
bar indicates 100 nm. Uranyl acetate (A, C, E) and
phosphotungstate (B, D).


b. It is indispensable for teaching. No textbook is
conceivable without it and students, as teachers
well know, want certitude. Taxonomy is also a
basic ingredient in theses. Without classification,
virology is just a magma for the beginner.

2. Classification is essential for phylogenetic studies. It 
identifies the very biological groups that are to be 
compared. For example, isolated genes or genome 
sequences are of limited interest, but gain immensely 
in significance when traced to other categories of 
organisms. Without phage classification, we would 
probably not know that such things as horizontal 
gene transfer exist. 

3. Classification is required for proper identification of: 

a. New phages. 

b. Harmful phages in biotechnology and industrial
fermentations.

c. Industrially important phages in patent applications.

d. Therapeutic phages. Now that phage therapy is
making a comeback, they must be identified.
It is indeed inconceivable that people should
be treated (even injected) with something
unknown.

4. Classification is a great research aid. For example, the
DNA size of tailed phages can be predicted with accuracy
from capsid dimensions.
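
The relation between capsid dimensions and DNA size mentioned in point 4 can be illustrated with a rough calculation. The sketch below is not the procedure used in this chapter. It assumes an isometric head approximated as a sphere, a hypothetical protein shell thickness, and a packing density of about 0.5 bp per cubic nanometer for double-stranded DNA; all three are illustrative parameters that would have to be calibrated against phages of known genome size.

```python
import math


def estimate_genome_kb(head_diameter_nm, shell_thickness_nm=3.0,
                       density_bp_per_nm3=0.5):
    """Rough dsDNA genome size (kb) for an isometric phage head.

    The head is treated as a sphere. The default shell thickness and
    packing density are illustrative assumptions, not values from the text.
    """
    inner_radius = head_diameter_nm / 2.0 - shell_thickness_nm
    if inner_radius <= 0:
        raise ValueError("head diameter must exceed twice the shell thickness")
    inner_volume = 4.0 / 3.0 * math.pi * inner_radius ** 3  # in nm^3
    return inner_volume * density_bp_per_nm3 / 1000.0  # bp -> kb


if __name__ == "__main__":
    for diameter in (50, 60, 100):
        print(f"{diameter} nm head: about {estimate_genome_kb(diameter):.0f} kb")
```

Because the estimate scales with the cube of the inner radius, small errors in the measured head diameter translate into large errors in predicted DNA size, which is why the density and shell parameters are left adjustable.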


Outlook 

Phage taxonomy is bound to evolve. New phage families
are likely to be discovered in unusual habitats, such as
volcanic springs, hypersaline lagoons, or the mammalian
rumen. Indeed, strange phage-like particles that probably
represent novel families have been reported many times, for
example arrow-shaped, apparently archaebacteria-related
particles in saline environments (18). Families are the
most stable part of the present edifice of viral taxonomy.
Genera and species are more fluid because there are no
universally accepted or applicable criteria for defining
them. Mayr’s “biological species definition” (11), based on
interbreeding with production of fertile offspring, was
devised for songbirds and is almost inapplicable to viruses.
The polythetic species concept (13), which is in principle
applicable to genera, is a concept and does not provide
concrete guidelines for definitions of phage species or
genera. I contend that no biologist (or virologist) can certify
what a species is. Virus (or phage) classification is therefore
still an art.

Figure 2-4 Siphoviruses (A, B) and podoviruses (C-F).
A: Staphylococcus aureus phage 6. B: Bacillus subtilis
phage BS5. C: Lactococcus lactis ssp. cremoris phage KSY1.
D: Salmonella typhimurium phage P22. E: Bacillus sp.
phage GA-1. F: Coliphage Esc-7-11. × 297,000; bar indicates
100 nm. Uranyl acetate (A, B, F) and phosphotungstate
(C, D, E).

Figure 2-5 Selected isometric and filamentous phages.
A: E. coli microvirus φX174, × 297,000. B: Tectivirus 37-64
of Thermus sp.; note inner membrane and deformed
particles; × 297,000. C: E. coli levivirus R17 adsorbed to
pili, × 148,500. D: Inovirus H75 of Thermus thermophilus,
× 92,400. Bars indicate 100 nm. Uranyl acetate (A) and
phosphotungstate (B-D).

The most important present challenge is the interpretation
of horizontal gene transfer in tailed phages. It may be
argued that any classification of these viruses is impossible
because of gene shuffling and the occurrence of certain
genes in apparently unrelated viruses. This attitude is a
dead end which can only lead to forfeiting the
benefits of classification. It overlooks major realities of viral
biology: (i) Common genes do not necessarily indicate
close relationships. Horizontal gene transfer is frequent
and ubiquitous. Some (all?) genes move through the living
world; for example T4 lysozyme surfaces in goose eggs
and human tears (1). (ii) Recombination can mask or
simulate relationships. (iii) In viruses related by evolution,
genomic relationships between different taxa are to be 
expected. The question of devising genera or species in 
tailed phages thus becomes a quantitative one. We may 
have to follow the bacteriologists who, arbitrarily, fixed 
the threshold value for species delineation at 60-70% 
DNA homology (8). 
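
To make the idea of a quantitative criterion concrete, the sketch below groups phages into provisional species whenever their pairwise DNA homology reaches a chosen cutoff. The phage names, the homology values, the 70% default, and the single-linkage grouping rule are all hypothetical choices made for illustration; none of them is prescribed by the text or by reference (8).

```python
def group_by_homology(names, homology, threshold=70.0):
    """Single-linkage grouping: two phages fall into the same group when
    their pairwise DNA homology (percent) is at or above the threshold."""
    parent = {name: name for name in names}

    def find(name):
        # Follow parent links to the group representative (with path halving).
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for (a, b), value in homology.items():
        if value >= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for name in names:
        groups.setdefault(find(name), []).append(name)
    return sorted(sorted(group) for group in groups.values())


# Hypothetical phages and homology values, for illustration only.
names = ["phageA", "phageB", "phageC", "phageD"]
homology = {
    ("phageA", "phageB"): 85.0,
    ("phageB", "phageC"): 72.0,
    ("phageA", "phageC"): 55.0,
    ("phageC", "phageD"): 20.0,
}
example_groups = group_by_homology(names, homology)
```

In this made-up data set, phageA and phageC end up in the same group although their direct homology is only 55%, because both are linked through phageB. The example therefore also shows how much the outcome depends on the arbitrary cutoff and on the grouping rule.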
Prophage Genomics

HARALD BRÜSSOW


At the time of this writing, the GenBank phage database
comprises 200 complete phage genome sequences. An
equivalent number of phage sequences were passively
acquired as prophages in bacterial genome sequencing
projects. The scientific value of these prophage sequences
was only recently recognized (13, 14). It goes beyond their
potential to double the content of the current phage database
and to correct a bias of the database towards selected
phage systems (coliphages, dairy phages, mycobacteriophages)
(7, 9). These prophage sequences allowed a first
insight into the evolution of phage-host genome interactions
at a molecular level. This analysis turned out to be
especially fruitful for bacterial pathogens (1, 5, 49).

Technical Difficulties 

Few phage genes are sufficiently conserved and distinct
from bacterial genes to serve as markers for prophage
sequences. For temperate Siphoviridae, suitable phage
proteins comprise the phage integrase (10), the portal
protein, the terminase (17), and the tail tape measure
protein. A computer program (25) combining semantic
searches of the gene annotations with BLAST searches
automatically identified most of the prophages compiled by
Casjens in a more labor-intensive approach (14). However,
other types of phages can integrate their DNA into
the bacterial chromosome: for example, P2- and Mu-like
Myoviridae, Inoviridae in Vibrio and Xanthomonas, and
Plasmaviridae in Acholeplasma. In addition to psiM1-like Siphoviridae,
Lipothrix- and Fuselloviruses integrate their genomes into
the chromosomes of Archaea. Still other forms of lysogeny
exist that do not lead to the integration of phage DNA into
the bacterial chromosome: for example, prophages P1 and
N15 are maintained as circular or linear plasmids, respectively.
However, different groups of temperate phages show
a remarkable conservation of their gene map (synteny) that
allowed a tentative identification of prophage sequences
with only a few landmark database matches (8, 16, 35).
A final difficulty with the detection of prophage sequences
in bacterial genomes is the trend for progressive deletion
and rearrangements of prophage DNA (18, 31).
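
The annotation-based half of such a search can be sketched in a few lines. This is not the program cited as (25): the keyword list simply mirrors the marker proteins named above, while the gene-record layout, the clustering window, and the minimum number of hits are assumptions made for illustration. The BLAST step is omitted entirely.

```python
# Marker proteins named in the text; matching is by simple substring search.
PHAGE_KEYWORDS = ("integrase", "portal", "terminase", "tape measure")


def is_phage_marker(annotation):
    """True if the annotation text mentions one of the marker proteins."""
    text = annotation.lower()
    return any(keyword in text for keyword in PHAGE_KEYWORDS)


def candidate_prophage_regions(genes, max_gap=10, min_hits=2):
    """Cluster marker genes lying within max_gap gene positions of each other.

    genes: list of (gene_index, annotation) tuples in chromosome order.
    Returns (first_index, last_index) pairs for clusters that contain at
    least min_hits marker genes. max_gap and min_hits are hypothetical
    tuning parameters.
    """
    hits = [index for index, annotation in genes if is_phage_marker(annotation)]
    regions, cluster = [], []
    for index in hits:
        if cluster and index - cluster[-1] > max_gap:
            if len(cluster) >= min_hits:
                regions.append((cluster[0], cluster[-1]))
            cluster = []
        cluster.append(index)
    if len(cluster) >= min_hits:
        regions.append((cluster[0], cluster[-1]))
    return regions


# Hypothetical annotation records, for illustration only.
example_genes = [
    (0, "DNA gyrase subunit A"),
    (5, "phage integrase"),
    (9, "portal protein"),
    (12, "terminase large subunit"),
    (40, "site-specific integrase"),
    (80, "ribosomal protein L2"),
]
example_regions = candidate_prophage_regions(example_genes)
```

Requiring several markers in close proximity is one simple way to reduce false positives from isolated integrase genes, but it will by design miss the highly degraded prophage remnants discussed later in this chapter.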

Distribution of Prophage Sequences 

When published bacterial genomes were screened for prophage
sequences, approximately half of them scored positive
(13, 14, 31). A clear bias was observed in the distribution:
most Archaea (only one subgroup showed phagelike viruses)
and intracellular bacterial parasites lacked prophages,
while many bacterial pathogens of humans, animals, and
plants frequently showed a high prophage content. When
all prophage sequences were located on an idealized
circular bacterial genome map, no preferred position of
the prophages was observed. However, when individual
bacterial groups were investigated, clear biases in the
prophage distribution were observed. For example, prophages
from Escherichia coli showed a trend for location in
the first half of the chromosome (i.e., between the origin
and the terminus of bacterial replication), while prophages
from Streptococcus pyogenes tended to cluster at both sides
of the terminus of replication (13). Interestingly, prophages
from low G-C content, Gram-positive bacteria were
differently oriented when located to the left or the right
of the terminus of bacterial replication. The lytic gene
cluster always pointed in the direction of the majority of
the surrounding bacterial genes.
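
Positional biases of this kind can be tabulated once each prophage is assigned to one half of the circular chromosome. The sketch below is one possible way to do this. It interprets the "first half" as the clockwise arc from the origin to the terminus of replication, which is an assumption, and all coordinates in the example are hypothetical.

```python
def replichore(position, origin, terminus, genome_length):
    """Return 'first' if position lies on the clockwise arc running from the
    origin to the terminus of a circular chromosome, otherwise 'second'."""
    arc = (terminus - origin) % genome_length
    offset = (position - origin) % genome_length
    return "first" if offset <= arc else "second"


def count_by_replichore(positions, origin, terminus, genome_length):
    """Count how many positions fall into each half of the chromosome."""
    counts = {"first": 0, "second": 0}
    for position in positions:
        counts[replichore(position, origin, terminus, genome_length)] += 1
    return counts


# Hypothetical coordinates on a 1000-unit circular map, for illustration only.
example_counts = count_by_replichore(
    [100, 250, 600, 950], origin=900, terminus=400, genome_length=1000
)
```

The modulo arithmetic handles the case in which the origin is not at coordinate zero, so that an arc wrapping around the end of the linear coordinate system is still treated as one continuous half.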

In highly prophage-infested bacterial genomes such as
E. coli O157 (37), Lactococcus lactis (15), and Streptococcus
pyogenes (3, 22, 41) up to 8% of the bacterial chromosome
consisted of prophage sequences. Where multiple strains of
the same bacterial species were sequenced, prophage DNA
accounted for a substantial amount of inter-strain genetic
variability. For example, genomic comparison between the
pathogenic E. coli strain O157 EDL933 and the laboratory
E. coli strain K-12 revealed 4.1 Mb of common chromosome
backbone sequence, 1.3 Mb of O157-specific DNA, and
0.5 Mb of K-12-specific DNA (37). Approximately half of
the O157-specific DNA was prophage DNA. Even more
extreme cases are known: when genomes from different
M serotypes of S. pyogenes were compared, the major gaps
in the dot-plot DNA sequence alignment were nearly
exclusively prophage sequences. Similarly, DNA-DNA
hybridizations in the microarray format revealed that
prophages frequently made important contributions to
the strain-specific gene complement of the sequenced reference
strain. This varied from less than 10% in bacteria
containing only one or a few prophage remnants, as in the
currently sequenced high G-C content, Gram-positive
bacteria (Mycobacterium, Bifidobacterium), to major contributions
in low G-C content, Gram-positive bacteria, where
the percentage ranged from about 30% in S. agalactiae
(2 prophages, (43)) to about 50% in Lactobacillus johnsonii
(2 prophages, (19, 46)) and up to 90% in S. pyogenes strains
containing three to six prophages (41).
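
The E. coli figures quoted above can be combined in a short calculation. The sketch assumes that each chromosome is simply the sum of the shared backbone and its strain-specific DNA, and it reads "approximately half" as exactly 0.5; both are simplifications for illustration.

```python
# Values quoted in the text (37), in megabases.
backbone_mb = 4.1
o157_specific_mb = 1.3
k12_specific_mb = 0.5

# "Approximately half" of the O157-specific DNA, taken here as exactly 0.5.
prophage_share_of_o157_specific = 0.5

# Assumption: chromosome size = shared backbone + strain-specific DNA.
o157_genome_mb = backbone_mb + o157_specific_mb
k12_genome_mb = backbone_mb + k12_specific_mb
o157_specific_prophage_mb = o157_specific_mb * prophage_share_of_o157_specific

if __name__ == "__main__":
    print(f"O157 EDL933 chromosome: about {o157_genome_mb:.1f} Mb")
    print(f"K-12 chromosome: about {k12_genome_mb:.1f} Mb")
    print(f"Prophage DNA within O157-specific DNA: about "
          f"{o157_specific_prophage_mb:.2f} Mb")
```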

Theoretical Framework 

The peculiar life-style of temperate phages makes them 
model systems for addressing a number of fundamental 
questions in evolutionary biology. The viral DNA undergoes 
different selective pressures when replicated as phage DNA 
during lytic infection cycles than it does as prophage DNA 
maintained in the bacterial genome during lysogeny. Darwinian
considerations along the lines of the selfish gene
concept lead to interesting conjectures (9, 13, 18, 31). One
could anticipate that the prophage decreases the fitness of
its lysogenic host by at least two processes: first by the metabolic
burden of replicating extra DNA and second by the lysis
of the host after prophage induction. To compensate for
these disadvantages one has to hypothesize that temperate
phages encode functions that increase the fitness of the lysogen.
According to the selective value of these phage genes,
the lysogenic cell will be maintained or even be overrepresented
in the bacterial population. An obvious selective
advantage for the lysogenic host is the immunity and superinfection
exclusion genes of the prophage that protect the
lysogen against phage infection. These genes are also of
direct advantage to the prophage since they exclude foreign
phage DNA from competing with the resident prophage DNA
for the same host. Where phages from the environment are
not a sufficiently strong selection pressure, other phage
genes have to increase the fitness of the lysogenic host,
frequently in rather unanticipated ways (lysogenic conversion
genes). Classic examples of such phage-encoded
genes that increase host fitness include diphtheria toxin,
streptococcal erythrogenic toxin A, and the nonessential
phage λ gene bor that confers serum resistance to the E. coli
lysogen (2). Interestingly, lysogenic E. coli clones were also
more competitive than prophage-free clones in laboratory 
growth (21). In these cases, the reproductive success of the 
lysogenic bacterium translates directly into an evolutionary 
success for the resident prophage. 


However, host-parasite relationships are also an 
arms race and represent therefore a highly dynamic genetic 
equilibrium. Gains from prophages carrying genes that
increase host fitness are short-lived from a bacterial standpoint
if the resident prophage ultimately destroys the bacterial
lineage. In this way, prophages can be considered to be
dangerous molecular time bombs that can kill the lysogenic
cell upon their eventual induction (31). One would therefore
expect evolution to select lysogenic bacteria with mutations
in the prophage DNA. Mutations that inactivate the prophage
induction process avoid the loss of the lysogenic clone
from the bacterial population. In a next step, one would 
expect that selection pressures lead to large-scale deletion 
of prophage DNA in order to decrease the metabolic burden 
of extra DNA synthesis. One predicts furthermore that 
useful prophage genes (e.g., lysogenic conversion genes) are 
preferentially spared from this deletion process since their 
loss would actually decrease the fitness of the cell. It was 
proposed that a high genomic deletion rate is instrumental 
in removing dangerous genetic parasites from the bacterial 
genome. Deletion processes could explain why the bacterial 
genomes did not increase in size despite a constant bombardment
with parasitic DNA over evolutionary time
periods. The streamlined bacterial chromosome containing 
few pseudogenes might be the consequence of this deletion 
process of parasitic DNA (31). 

Prophages in Streptococcus pyogenes

An interesting test case for the predictions of the theoretical
model is the important human pathogen S. pyogenes. The
invasive S. pyogenes M1 strain SF370 contains eight prophage
elements (18, 22) (figure 3-1). Only prophage SF370.1
could be induced by mitomycin C treatment. This prophage
is a pac-site temperate member of the Siphoviridae found in
many different lactic acid bacteria. Its closest relative was
prophage NIH1.1 from a Japanese M3 S. pyogenes strain
(12, 27). Notably, possession of prophage NIH1.1 differentiated
older from newly emerging S. pyogenes strains in
Japan (28). Prophage acquisition might thus be a major
mechanism of short-term evolution in this epidemiologically
highly dynamic bacterial species. In fact, extensive genomics
analysis combined with epidemiological surveys of
S. pyogenes isolates over the last decades by clinical microbiologists
led to an attractive model. James Musser and
colleagues suggested that the recently emerged unusually
virulent subclones of M3 S. pyogenes strains are the result
of sequential acquisition of three prophages and their associated
virulence genes over the last 80 years (1, 3). If this
observation can be generalized to other bacterial pathogens,
medical microbiologists have to confront foes that evolve in
the fast lane. As in the acquisition of antibiotic resistance
genes, the development of bacterial pathogenicity seems
to rely heavily on the use of mobile DNA elements. It is
currently unclear why resistance genes come with plasmids
and transposons, while virulence genes show a tendency to
travel with the genomes of temperate phages.

Figure 3-1 Location of Streptococcus pyogenes strain SF370 prophages and their genomic maps. Left: Location and relative size
(in kilobases) of prophages and prophage remnants R on the S. pyogenes strain SF370 genome map. The linked arrows
indicate a possible site for homologous recombination between closely related prophage DNA segments. Right: The genome
maps of three SF370 prophages and a prophage from S. pyogenes strain NIH1 are aligned with the attachment sites at the left
and right ends. Prophage NIH1.1 has been identified as a genetic marker for recently emerged clinical isolates of S. pyogenes
in Japan. The phage modules are: lysogeny, DNA replication, probable transcriptional regulation, DNA packaging and head
morphogenesis, head-to-tail joining, tail synthesis, tail fiber synthesis and assembly, lysis, and lysogenic conversion genes
encoding superantigen/mitogenic factors. Large vertical arrows indicate mutations that are likely to inactivate the prophage.
An asterisk marks phage genes that potentially contribute to the virulence of the lysogenic host. The phage hyaluronidase
is labeled by a triangle. Regions of DNA sequence similarity between the prophages are connected by shading (for details
see 13). See thebacteriophages.org/frames_0030.htm for a color version of the figure.

All 20 complete prophages detected in the currently 
sequenced S. pyogenes strains showed likely lysogenic 
conversion genes between the phage lysin gene and the 
right attachment site. Prophage SF370.1 encoded the pyrogenic
exotoxin C and a mitogenic factor (22). Related
proteins were encoded in the other S. pyogenes prophages 
covering distinct members of superantigens, mitogenic 
factors (DNases), and toxic enzymes. These phage proteins 
may contribute to the immune deregulation observed 
during invasive streptococcal infections. The lysogenic 
conversion genes in the prophages differ in their G-C content 
from the surrounding prophage and bacterial DNA (22). 
Their location in the vicinity of the phage attachment site 
suggested a faulty phage excision process in an unusual 
bacterial host of lower G-C content as the origin of this 
DNA. The subsequent spread of these genes is also suggested 
by the presence of sequence-identical genes in the veterinary
pathogen S. equi. Conceptually, bacteria that can acquire
multiple prophages with related but distinct forms of 
superantigens or other virulence factors can play a type of 
combinatorial biology for the construction of a successful 
pathogen. 
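
Differences in G-C content such as those described for the lysogenic conversion genes are commonly detected by comparing the base composition of short windows with that of the whole sequence. The sketch below shows this idea in its simplest form. The window size, step, and deviation cutoff are arbitrary illustrative parameters, and the test sequence is artificial; this is not the analysis performed in (22).

```python
def gc_content(seq):
    """Fraction of G and C bases in a DNA sequence (0.0 for an empty one)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def gc_windows(seq, window, step):
    """Return (start, gc_fraction) for successive full windows along seq."""
    return [(start, gc_content(seq[start:start + window]))
            for start in range(0, len(seq) - window + 1, step)]


def deviating_windows(seq, window, step, min_difference):
    """Windows whose G-C fraction differs from the whole-sequence value by
    at least min_difference (an arbitrary, illustrative cutoff)."""
    overall = gc_content(seq)
    return [(start, gc) for start, gc in gc_windows(seq, window, step)
            if abs(gc - overall) >= min_difference]


# Artificial sequence: a low G-C segment embedded in high G-C DNA.
example_seq = "GC" * 10 + "AT" * 10 + "GC" * 10
example_hits = deviating_windows(example_seq, window=20, step=20,
                                 min_difference=0.5)
```

On real genomes the windows would be hundreds to thousands of base pairs long, and the reference value could be taken from the surrounding prophage or from the bacterial chromosome rather than from the analyzed segment itself.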

Streptococcus pyogenes showed other elements predicted
by theory, for example prophage inactivation. Prophage
SF370.3 showed a 33 kb-long genome that closely resembled
the genome organization of the cos-site temperate Lactococcus
lactis phage r1t (18). Analysis of the prophage genome
revealed mutations in the replisome organizer gene that
may prevent the induction of the prophage. Prophage
SF370.2 showed a 43 kb-long genome that again resembled
the genome organization of pac-site temperate Siphoviridae
infecting dairy bacteria. SF370.2 showed two inactivating
mutations (one in the replisome organizer gene and another
in the gene encoding the portal protein) and four DNA insertions
into the prophage DNA (18) (figure 3-2).

A clear trend for prophage genome reduction was
documented by the many S. pyogenes prophage remnants
that are probably the result of massive losses of prophage
DNA (18). The largest prophage remnant, SF370.4,
showed a 13 kb-long genome consisting of lysogeny, DNA
replication, and transcriptional regulation genes still
flanked by attachment sites. Other prophage remnants
were much smaller and consisted frequently only of the
phage integrase and one or a few associated prophage
genes (a cI-like phage repressor was a frequent finding).

In S. pyogenes, prophages are apparently hotspots
for genetic recombination. A particularly conserved
genome region is the tail fiber gene cluster, especially
around the hyaluronidase gene. This tail fiber protein
must provide the phage with access to the target cell
through the hyaluronic-acid-containing bacterial capsule.
Some post-streptococcal patients mount antibodies
against this phage protein. Since hyaluronic acid is also
part of the connective tissue, this immune response seems to
suggest that this prophage protein might assist in the propagation
of the pathogen along tissue planes of patients.

Figure 3-2 Alignment of the phage genome maps. Top: Lactococcus lactis phage Tuc2009. Middle: Streptococcus pyogenes
prophage SF370.2. Bottom: Streptococcus thermophilus phage O1205. Open reading frames were depicted that exceeded 60
codons in length and that started with an ATG codon (exceptions are ORFs 33 and 34 which start with GTG). The genes were
identified by their codon length or were numbered starting with the integrase gene. Selected proteins were annotated with
their putative functions. The genes are connected with shading when the predicted proteins shared significant sequence
similarity. Selected similarities are documented by percent amino acid identity and their log BLAST E-values. Putative
insertions are noted I1 to I4. Possible rho-independent terminators are indicated with a hairpin. The wavy line provides the
GC content curve for SF370.2 over the genome scale in base pairs. See thebacteriophages.org/frames_0030.htm for a color
version of this figure.

The conserved DNA segments around the hyaluronidase 
gene and short conserved DNA segments near the attR sites 
might be targets for homologous recombination between 
the prophage and a superinfecting phage and thus result 
in the reshuffling of virulence genes between phages. 
However, the region around the phage hyaluronidase genes 
might serve also as target sites for homologous recombination
between two prophages residing in the same host
chromosome. This type of recombination will then result 
in rearrangements of the bacterial genome (9). The genome 
comparison of a US and a Japanese M3 serotype S. pyogenes 
strain provided convincing evidence for such a prophage- 
mediated bacterial genome rearrangement event (36). 

Prophages in the Genomes of Dairy
Bacteria and Gut Commensals

Lactococcus lactis is a major bacterial starter organism used
in industrial cheese fermentation. Strain IL1403 contains
in its 2.3 Mb genome six prophage elements; all were flanked
by the phage attachment sites and with one exception could
be excised from the bacterial genome (15). However, only two
prophages gave rise to infectious phage particles after prophage
induction. Two prophages resembled in their genome
organization the temperate cos-site L. lactis phage BK5-T, an
Sfi21-like siphovirus. In the lysogen, only two genome
regions of prophage BK5-T were transcribed (4): one transcript
started in the cI-like repressor gene extending
into the lysogeny module, while another transcript
was located between the lysin gene and the right attachment
site, the region where S. pyogenes prophages encode
virulence factors. Three 15 kb-long DNA segments in
IL1403 represent likely prophage remnants. All had
conserved part of the lysogeny module (integrase/repressor)
and, in variable amounts, DNA replication and a few
structural genes (15).

Many temperate Siphoviridae from a wide range of low
G-C content, Gram-positive bacteria showed extra genes
between the lysin gene and attR. Prophages from the pathogen
Staphylococcus aureus encoded virulence genes near attR
(leukotoxins, enterotoxins, exfoliative toxins, staphylokinase)
(13). Two different types of prophages from Streptococcus
thermophilus, the starter used for yogurt fermentation,
showed genes without database matches in that region.
Transcription analysis demonstrated that these phage
genes were the only prominently expressed prophage genes
in addition to the cI-like repressor and the superinfection
exclusion genes (45). The rest of the prophage genome is
transcriptionally silent, in marked contrast to lytic infections
engendered by the same phages. Notably, these candidate
lysogenic conversion genes belonged to the early genes transcribed
in the lytic infection cycle (48) (figure 3-3). However,
database matches provided no hints for a potential function
of these genes. Interestingly, S. thermophilus prophage TP-J34,
which integrates into a tRNA gene without disrupting it,
showed a lysogenic conversion phenotype, that is, a change
in growth property (aggregated vs. planktonic growth)
(H. Neve, personal communication). A prophage remnant
was also detected in S. thermophilus (45). The phage attachment
sites flanked the phage integrase and two nonattributed
but transcribed prophage genes, suggesting a sparing of
potential lysogenic conversion genes from the deletion
process as predicted by the theoretical framework.

Figure 3-3 Transcription maps of temperate Streptococcus thermophilus phages. Top: Genome map of phage Sfi21 as found
in the extracellular phage particle. The individual phage modules as determined by bioinformatic analysis are marked with
the brackets above the map. The arrows below the map indicate the transcripts: early, middle, and late. The mRNA length in
kilobases is indicated by the numbers above the arrows. The 5' start sites, determined by primer extension analysis, are
marked by a P in a circle. The width of the arrow indicates the strength of the autoradiographic signal as determined by
Northern blots. The wavy lines indicate smeared hybridization results with a sharp upper end (48). Bottom: Genome maps of
S. thermophilus prophages O1205 and Sfi21 as they are integrated in the bacterial chromosome. The arrows under the map
mark the transcribed prophage genes (45). Note that the transcription of the prophages is limited to the regions near both
attachment sites. See thebacteriophages.org/frames_0030.htm for a color version of this figure.

The prophages from lactobacilli, a third group of dairy
bacteria and important commensals of humans and
animals, were investigated in some detail (see chapter 41).
The prophages from a Lactobacillus plantarum strain are 
instructive in the present context (47). Two prophage regions 
contained transcribed extra genes. They were located 
between the phage lysin and attR and between the phage 
repressor and the integrase genes. Notably, several of these 
extra genes shared 30-40% protein sequence identity with 
candidate virulence genes from bacterial pathogens, most 
prominently with candidate lysogenic conversion genes 
from S. pyogenes prophages (47). In addition, Lactobacillus 
prophages contained transcribed tRNA genes. 

Prophages in E. coli 

The laboratory E. coli strain K-12 contains, in addition
to phage λ, eight prophage remnants (11). Three are
sequence-related to prophages integrated at analogous loci
in the pathogenic O157 E. coli strains (26, 39). However,
933R prophage in O157 is far more complete than its Rac
homolog in K-12 and apparently misses only the N-kil gene
region. Shigella flexneri, which separated from K-12 less than
1 million years ago (29), can be aligned with K-12 at the DNA
level. Some of the gaps in the alignment were made up by
many likely prophage remnants. The lack of alignment of
prophage remnants between these closely related strains
suggests that the prophage decay process must occur over
short evolutionary time periods.

Figure 3-4 Dot-plot matrix for phage λ and the 11 λ-like prophage elements, Sp3 to Sp15, from the
E. coli strain O157:H7 Sakai. Phage λ is given as the conventional vegetative map (starting with the cos-site and the
packaging genes; chapter 27) while the Sakai prophages start with attL and the integrase gene. All prophages except Sp6
share DNA sequence identity with the genes of λ. Sp6 is clearly distinct from λ at the DNA level while it shares long regions
of DNA identity with other Sakai prophages. Remarkable is the high degree of DNA sequence identity between the different
Sakai prophages. The sequence identity should lead to frequent genome rearrangements due to homologous recombination
between the prophage elements.

In the O157 strain EDL933, 12 prophage sequences were 
identified. Only the Shiga-toxin-converting phage 933W 
can produce infectious particles (40). Among the 18 
prophages of the O157 Sakai strain, 11 are λ-like phages that 
share large regions of high DNA sequence identity among 
themselves and with phage λ (figure 3-4). It is currently not 
clear whether this intriguing similarity is the result of 
infection by closely related phages or propagation by a 
copy-and-paste mechanism within this cell lineage followed 
by some diversification via modular exchanges. Shared 
prophage sequences are associated with extensive genome 
rearrangements in Xylella fastidiosa, a Gram-negative plant 
bacterial pathogen known from different pathovars (44). 

The remaining O157 prophages distantly resemble 
phages P2 and P4 and closely resemble phage Mu. They all 
contain frameshift mutations and various types of deletions 
and insertions of IS elements and no infectious phage could 
be induced (33, 51). 

The Shiga toxin Stx2-producing prophages in the two 
O157 strains are integrated into the same locus, wrbA, and 
their DNA sequences are nearly identical over 85% of the 
genome (33, 40). They differ over the genetic switch and the 
DNA replication region. Both phages carry additional 
virulence factors (bor, lom) and a toxin/antitoxin system used by 
plasmids to ensure their maintenance. Numerous potential 
virulence factors are also encoded by further prophages 
from both O157 strains: Shiga toxin Stx1 (933V, Sp15), an 
intestinal colonization factor (933O), and superoxide 
dismutase (Sp4, Sp10) (37). The potential lysogenic conversion genes 
were located in a few preferred prophage genome positions: 
in decreasing frequency, downstream of the Q-like and 
N-like antiterminator and the tail fiber genes (5). This 
observation suggests different acquisition and transcriptional 
regulation mechanisms in prophages from Gram-negative 
and Gram-positive bacteria. 

Prophages have apparently played a decisive role in the 
emergence of O157 as a food pathogen. This is not an isolated 
case. Salmonella typhimurium, another important food 
pathogen, possesses a variable assortment of prophages 
that apparently represent a transferable repertoire of 
pathogenic determinants (24). The importance of prophage genes 
for the in vivo virulence of Salmonella was demonstrated 
by inactivation studies of selected prophage genes (23). 
In addition, prophages were an important source of genetic 
diversity between two closely related Salmonella enterica 
strains showing distinct pathogenic potential (serovars 
Typhimurium and Typhi) (34). 

Outlook 

This overview can provide only a short outline of an 
exciting research area at the interface between phage 
and bacterial genomics, evolutionary biology, and medical 
microbiology. From the few systems presented here it 
becomes clear that the interaction between phages and 
bacteria is not a simple arms race between a parasite and 
its host. Phage DNA also contributes to the fitness of 
the bacterial cell. Phages are apparently used by bacterial 
cells for the rapid acquisition and ecological testing of 
nonessential “phage” genes that were themselves probably 
acquired from other bacterial genomes during rare 
passage of the phage in a heterologous host. This leads 
to provocative questions. Are phages major drivers of 
the evolution of bacterial pathogens? Can the variable clinical 
potential of some protean bacterial pathogens such as 
S. pyogenes, E. coli, or Salmonella spp. be explained by 
possession of a specific prophage set? Many bacterial 
pathogens show a very dynamic epidemiology. Is the replacement 
of older by newer epidemic strains linked to the 
acquisition of new prophages or prophage combinations? 
Prophage genes are apparently not a silent cargo to the 
bacterial lysogen. Upon changed cultivation conditions, 
prophage genes frequently represent prominently upregulated 
genes. This was, for example, observed in S. pyogenes 
during temperature shift experiments (42) that mimicked 
the transition from mucosa-associated bacteria to bacteria 
in the bloodstream during acute disease. Other examples 
are the upregulation of prophage gene transcription in 
pathogenic E. coli strains when laboratory-grown bacteria 
were compared with bacteria during acute infection in 
an animal (20). Filamentous prophage genes were among the 
prominent genes showing changes in gene expression 
when Pseudomonas aeruginosa strains were compared in 
the transition from planktonic to biofilm growth (50). 
Finally, some phage genes might “cross-talk” with mammalian 
genes. For example, some lysogenic conversion 
genes from S. pyogenes prophages are only expressed 
when the bacterial cell comes into contact with target 
cells of the mammalian host (6). The latter excrete a low 
molecular weight factor that induces the transcription of 
the prophage gene. It is increasingly becoming clear that 
pathogens, commensals, and symbionts represent different 
positions on a continuum of bacteria-host interactions. An 
indication is the observation that both gut commensals and 
pathogenic bacteria are under the selection pressure of the 
mammalian immune system (30, 32). One might therefore 
suspect that prophages may also play an important role 
in the ecological adaptation of bacterial commensals and 
symbionts. 

Note added in proof 

An update of the literature can be found in: Canchaya, C., 
G. Fournous, and H. Brüssow. 2004. The impact of prophages 
on bacterial chromosomes. Mol. Microbiol. 53: 9-18; 
Brüssow, H., C. Canchaya, and W.-D. Hardt. 2004. Phages 
and the evolution of bacterial pathogens: from genomic 
rearrangements to lysogenic conversion. Microbiol. Mol. 
Biol. Rev. 68: 560-602. 

Evolution of Tailed Phages: Insights from 
Comparative Phage Genomics 

HARALD BRÜSSOW AND FRANK DESIERE 


Many phage researchers believe that phages are as old 
as their bacterial hosts. If this hypothesis is true, then 
we have to postulate elements of vertical evolution for 
phages. In view of the postulated antiquity of these 
relationships we might not expect sequence similarities between 
more distantly related phages. Comparative phage genomics 
can reveal DNA sequence similarity, protein sequence 
similarity, or, in the absence of sequence similarity, gene map 
similarity between increasingly diverged phages. For even 
more distant relationships, data from structural biology can 
be informative. Recent genomics-based ideas on phage 
evolution are dominated by two interpretations. In one view all 
double-stranded DNA tailed phage genomes are mosaics with 
access, by horizontal exchange, to a large common genetic 
pool, but in which access to the gene pool is not uniform for 
all phages (22). In this hypothesis horizontal gene transfer 
dominates over vertical evolution. In fact, it is in some way 
an updated version of the classical modular theory of phage 
evolution developed 30 years ago on the basis of 
heteroduplex mapping with lambdoid coliphages (2, 7). Other 
investigators studying phages from dairy bacteria observed strong 
elements of vertical evolution in the structural gene cluster 
of phages that were not erased by horizontal gene transfer 
events (5). The two hypotheses are not mutually exclusive 
since they only set a different balance for horizontal 
and vertical elements in phage evolution. In the following 
we will present the basic observations leading to the 
two hypotheses, search for a synthesis of both concepts, 
and challenge this unifying view with recent phage 
sequence data. 

Comparative Genomics of Streptococcus 
thermophilus Phages 

Six Streptococcus thermophilus (St) phages were completely 
sequenced. As revealed by comparative genomics, the 
population genetics of St phages is relatively simple (5) 
(figure 4-1). The hypothetical St phage genome can be 
subdivided into four large segments, each with its own 
mechanisms for creating diversity. One segment is the late gene 
cluster extending from the DNA packaging genes to the tail 
genes. This module is represented by two unrelated 
configurations: one is characteristic for cos-site phages (prototype: 
phage Sfi21), the other for pac-site phages (prototype: phage 
Sfi11) (27). The two structural gene clusters are not related to 
each other at the nucleotide or protein sequence level. Both 
clusters diversify by the accumulation of point mutations 
(28). Pair-wise comparisons within each cluster revealed on 
average 10-20% bp differences. 
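
The pair-wise figure quoted above is a simple count of mismatched positions between aligned sequences. A minimal sketch of that calculation follows, assuming the two sequences have already been aligned to equal length with "-" marking gaps; the function name and the toy sequences are illustrative and not taken from the cited studies.

```python
def percent_difference(seq_a: str, seq_b: str) -> float:
    """Percentage of differing bases between two pre-aligned sequences.

    Columns where either sequence has a gap ("-") are skipped, so the
    value reflects point differences only, not insertions or deletions.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to equal length")
    compared = 0
    mismatches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue  # ignore gap columns
        compared += 1
        if a != b:
            mismatches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * mismatches / compared


# Toy example: 2 mismatches in 10 aligned positions.
print(percent_difference("ACGTACGTAC", "ACGTTCGTAA"))
```

Excluding gap columns is a design choice for this sketch; whether the published percentages treated insertions and deletions the same way is not stated in the text.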

A second segment covers the putative tail fiber, lysis, and 
lysogeny genes. Diversity is created by insertion, deletion, 
and replacement of DNA segments and to a lesser degree by 
point mutations (11). In the lysogeny module, recombination 
processes apparently underlie the acquisition of different 
types of superinfection immunity and repressor binding 
specificity in the genetic switch region. When the lysogeny 
modules from two temperate St phages were aligned, an 
alternation of conserved and variable DNA segments was 
observed (37). Some transition zones were exactly at gene 
borders. Others were in the middle of genes, separating 
protein domains. Deletions in spontaneous or 
repressor-selected phage mutants were located conspicuously close to 
these transition zones. Lytic phages, which dominate the St 
phage population, are apparently derived from temperate 
phages by a combination of rearrangement and deletion 
events in the lysogeny module (27). Recombination 
apparently also plays a role in the creation of diversity in the 
putative tail fiber genes. In the phage protein that probably 
interacts with the phage receptor on the bacterial cell, 
variable and conserved DNA segments alternated (27). 
Spontaneous deletions were observed that started and ended in 
DNA repeats encoding collagen-like protein motifs (11). 
Phages that differed in host range showed completely 
unrelated variable regions, while phages with overlapping host 
ranges shared highly related variable regions. Swapping of 
variable domains between St phages resulted in corresponding 
host range changes in the recombinant phage (18). 

Figure 4-1 Dot-plot matrix calculated for the genome sequences of the S. thermophilus phages. Phages include Sfi19, Sfi11, 

The third genome segment was the putative DNA replication 
module represented by two distinct gene constellations: 
the Sfi21-like and the 7201-like DNA replication module. The 
Sfi21-like DNA replication module is present in the vast 
majority of isolated phages and is unusually conserved (13). 
Even at the third codon position less than 1% sequence 
diversity was observed in independent phage isolates, suggesting 
that this module was recently spread horizontally between 
St phages. The fourth St phage genome segment covers the 
rightmost 5 kb of the genome. This region gives rise to early 
transcripts encoding a protein necessary for middle and late 
transcription (46). Diversity is created by insertion/deletion 
processes while the DNA sequence is highly conserved (27). 
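
The third codon position is informative because most changes there are synonymous and so accumulate even under strong selection on the protein; very low diversity at that position therefore points to recent common origin rather than to selection alone. The sketch below shows one way such a figure could be computed, assuming two in-frame, gap-free coding sequences of equal length; the function name and sequences are hypothetical.

```python
def third_position_divergence(cds_a: str, cds_b: str) -> float:
    """Percentage of differing bases at third codon positions.

    Both inputs are assumed to be in-frame coding sequences of equal
    length (a multiple of three) with no alignment gaps.
    """
    if len(cds_a) != len(cds_b):
        raise ValueError("coding sequences must have equal length")
    if len(cds_a) == 0 or len(cds_a) % 3 != 0:
        raise ValueError("length must be a non-zero multiple of three")
    # Every third base, starting at index 2, is a third codon position.
    thirds_a = cds_a.upper()[2::3]
    thirds_b = cds_b.upper()[2::3]
    diffs = sum(1 for a, b in zip(thirds_a, thirds_b) if a != b)
    return 100.0 * diffs / len(thirds_a)


# Toy example: GCT -> GCC is a synonymous change (alanine in both).
print(third_position_divergence("ATGGCTAAA", "ATGGCCAAA"))
```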

Graded Relatedness in Sfi21-like 
Siphoviridae 

Graded relatedness is the hallmark of evolving systems. 
Comparison of phages infecting the same host species will 
not yield much evidence for vertical phage evolution. 
Therefore, comparisons were made between temperate 
Siphoviridae infecting distinct, but evolutionarily related 
genera of host bacteria. Complete genome sequences are 
now available for many Sfi21-like cos-site Siphoviridae that 
infect several distinct genera of low G-C content, 
Gram-positive bacteria (14). The comparison of these genomes led 
to two interesting observations (figure 4-2). First, a gradient 
of relatedness was detected between these phages. Second, 
when projected on St phages the relatedness of the phages 
reflected approximately the phylogenetic relationships 
of their bacterial hosts, suggesting some coevolution of 
phages with their host bacteria. The closest relatives of 
Sfi21 were other cos-site S. thermophilus phages, which 
shared more than 80% DNA sequence identity with Sfi21. 
At the next level of relatedness is the Lactococcus phage 
BK5-T, which shared 60% DNA identity over the DNA 
packaging and head morphogenesis genes with Sfi21. At 
the protein level, the similarity between the two phages 
extended essentially over the entire morphogenesis 
module (14). The next most closely related phage to Sfi21 is 
Lactobacillus phage adh. Sfi21 and adh are linked by about 
40% amino acid identity over the DNA packaging, head and 
tail morphogenesis genes. DNA similarity was not detected. 
Lower levels of protein sequence identity were detected 
between individual proteins of phage Sfi21 and Bacillus 
phage φ105 or Staphylococcus phage PVL. However, the 
idea of phage-bacterium coevolution received a blow 
when Sfi21-like phages sharing DNA sequence, protein 
sequence, or only gene map similarity were observed in a 
single bacterial species (41). 

Figure 4-2 Alignment of the genetic maps of phages from diverse bacterial lineages. Included are Lactobacillus phage adh, 
Streptococcus phage Sfi21, Lactococcus phage BK5-T, Staphylococcus phage PVL, and Bacillus phage φ105. Corresponding 
reading frames were numbered for an easier orientation with the original publications. See thebacteriophages.org/ 
frames_0040.htm for a color version of this figure. 

Sfi11-like Siphoviridae 

Sfi11-like pac-site phages were also not limited to 
S. thermophilus. Extensive protein sequence similarity to 
the structural genes of Sfi11 was detected in a number 
of pac-site phages from low G-C content, Gram-positive 
bacteria. This series covers, in decreasing order of 
relatedness, prophages from S. pyogenes, and phages from 
Lactococcus (TP901-1, Tuc2009), Lactobacillus (φg1e), Bacillus 
(SPP1), and Listeria (A118) (15). Nucleotide sequence 
similarity between pac-site St phages infecting the same 
host species was high and extended over large regions of 
the genome. DNA sequence similarity was also detected 
between S. thermophilus and S. pyogenes phages. However, 
the degree of similarity was lower and restricted to part of 
the structural genes. At the protein level, a complex 
but extensive network linked pac-site phage genomes 
from different species (15). Over the structural gene cluster 
the Sfi11-like phages showed an almost identical gene map. 
Only phage SPP1 differed from the other phages by the 
insertion of supplementary genes at two genome positions. The 
Sfi11-like phages differed from the Sfi21-like phages by 
the possession of two major head proteins instead of one, 
the possession of a scaffold protein, and the lack of proteolytic 
processing of the major head protein. 

λ Supergroup of Siphoviridae 

Genomic similarities are not limited to phages from the lactic 
acid bacteria and the Bacillus branch of Gram-positive 
bacteria. Over the DNA packaging, head, and tail modules 
Sfi21-like phages showed clear similarities with E. coli 
phage HK97 (5, 12) or the Pseudomonas phage D3 (14). Over 
this genome region not only did these phages show nearly 
identical gene maps but they were also linked by sequence 
similarities over several proteins. The major head proteins 
of Sfi21 and HK97 lacked any sequence similarity, but 
showed identical secondary structure predictions and 
proteolytic processing at identical amino acid positions (12). 
Interestingly, the gene map of Sfi11-like phages resembled 
that of phage λ as closely as Sfi21-like phages resembled 
E. coli phage HK97. The significance of these attributions 
was supported by weak sequence similarities between 
phage λ and Sfi11-like phages over the major head protein 
(15). This observation suggests that the Sfi21/HK97-like and 
Sfi11/λ-like head modules represent different lineages of 
head modules in Siphoviridae and have not evolved in either 
of these two bacterial species. An analysis of the phage 
genome organization led to the definition of a λ supergroup 
of Siphoviridae (figure 4-3). Notably, the similarities do not 
cover all Siphoviridae: T1-, T5- and SP-like phages are clearly 
distinct from the λ supergroup. Both observations 
demonstrate that the mainly morphologically defined 
taxonomical phage groups do not represent monophyletic 
groups. Further long-range relationships were detected. For 
example, Sfi21-like phages differed clearly from lambdoid 
coliphages in the organization of the lysogeny module, 
while they shared a number of characteristics with the 
lysogeny module from P2-like Myoviridae from 
Gram-negative bacteria (29). Surprisingly, tailed phages are 
found in one branch of the Archaea. With respect to the 
organization of their structural gene cluster, these viruses are 
clearly members of the λ supergroup of phages. Notably, 
archaeaphages showed over their portal and major head 
proteins clear similarity with a P2-like Myovirus (5) and 
shared related nonstructural genes with a Lactobacillus 
prophage of the Sfi11 series. 

Figure 4-3 Comparison of the genomes from the established and proposed genera of Siphoviridae constituting the 
λ supergroup of phages. Corresponding genes are indicated with the same shading indicated at the top. The following 
phages were included (from top to bottom): Archaeavirus ψM2, coliphage HK97, Streptococcus phage Sfi21, coliphage λ, 
Streptococcus phage Sfi11, Streptomyces phage φC31, Lactococcus phage sk1, Mycobacteria phages L5 and TM4. Selected 
genes are indicated to allow an easier orientation with the database entries. See thebacteriophages.org/frames_0040.htm 
for a color version of this figure. 


Lambdoid Coliphages 

The basic ideas on phage evolution were developed for 
lambdoid coliphages in the pre-genomics era (2, 7). According to 
the modular theory of phage evolution, the product and unit 
of phage evolution is not a given virus but a family of 
interchangeable genetic elements (modules), each of which is 
multigenic and can be considered as a functional unit. 
Homologous functions can be fulfilled by a number of 
distinct DNA segments that lack any sequence similarity. 
Exchange of a given module for another occurs by 
recombination among phages belonging to an interbreeding phage 
population. The theory was developed 30 years ago on the 
basis of heteroduplex mapping between lambdoid phages 
(2). This method allowed the distinction of 11 modules in 
lambdoid coliphages, each represented by a number of 
alleles. Some modules (e.g., head gene cluster) covered a 
large DNA segment, were homogeneous, and were 
represented by only a few distinct alleles. Other modules 
comprised a small genome region and were represented by 
many allelic forms (e.g., late gene control) or showed further 
subdivision into smaller subunits (e.g., early gene control). 
Comparative genomics of four sequenced lambdoid 
coliphages (λ, HK022, HK97, N15) identified a nearly perfect 
colinear gene map for their structural genes (10, 23, 42). 
Exceptions were small inserts of genes that are under 
independent transcription control (“morons”). Phages HK97 
and HK022 shared extensive DNA sequence identity in a 
mosaic-like fashion (23). Phages λ and N15 also showed 
DNA sequence-related structural genes (42). However, the 
HK97 and λ groups of lambdoid coliphages were not even 
linked by protein sequence similarity, suggesting that they 
represent distinct evolutionary lineages of phage structural 
modules (“Sfi21-/HK97-like” and “Sfi11-/λ-like” lineages), 
both derived from a very ancient shared ancestor module 
that has diversified beyond sequence conservation. 
Interestingly, Pseudomonas phage D3 shared a similar 
structural gene map and much protein, but no DNA, sequence 
similarity with HK97 and HK022 (25). We interpret D3 as a 
more distant relative of HK97 within the Sfi21-like phage 
lineage. D3 is an interesting evolutionary linker since 
several of its head proteins were sequence-related to phages 
from Gram-positive bacteria. In addition, the location of the 
D3 lysin gene between tail genes and integrase is typical for 
temperate phages of low G-C content, Gram-positive 
bacteria. 

The right arm of the lambdoid coliphages encoded 
nonstructural genes. Over this region phages λ, HK022, and 
HK97 showed a comparable gene organization and shared 
DNA sequence identity in a patchwise fashion. The fact that 
phage N15 deviated substantially from this gene map is 
explained by its peculiar life-style (42): N15 persists in the 
lysogenic cell as a low-copy linear plasmid with closed 
hairpin telomeres. 

Mechanisms of Recombination 

Comparative genomics data can constrain hypotheses on the 
mechanism of modular exchange reactions. Transition 
zones from homology to heterology in lambdoid phages 
were mostly in intergenic regions. As in dairy phages, 
intragenic transition zones frequently separated protein domains. 
One report (10) showed the presence of some short regions of 
sequence homology between distinct nonstructural gene 
modules in lambdoid coliphages. These linker sequences 
could promote genetic reassortment (modular exchanges) 
through homologous or site-specific recombination. In an 
alternative model nonhomologous recombination occurs 
indiscriminately and pervasively across the genome of 
lambdoid phages, followed by stringent selection for functional 
phages (23). The second part of that process eliminates 
practically all products of nonhomologous recombination within 
coding regions, leaving the gene-boundary recombinants 
and thereby giving the overall process an undeserved 
appearance of order and purpose. Extensive genome 
comparisons in mycobacteriophages support this 
alternative model (40). 

Homologous recombination can occur every time a phage 
infects a cell carrying a prophage with appropriate DNA 
homologies. These events serve not to create new mosaic 
boundaries but to rapidly reassort existing gene modules 
within the population (6). The sequence analysis of two 
E. coli O157 strains underlined the importance of this 
process. Numerous nearly complete prophage sequences 
were detected in both O157 strains and many shared long 
regions of DNA sequence identity over the structural gene 
cluster with phage λ, one with phage Mu, but none with 
coliphages HK97, HK620, P2, or P4 (38). 

The more than 20 lambdoid coliphage prophage 
sequences retrieved from the different E. coli sequencing 
projects now allow a first estimate of the genetic diversity 
within that phage group. For the head gene module, for 
example, heteroduplex mapping has distinguished four 
alleles represented by phages λ, 21, φ80, and HK97 (7). 
Three O157 prophages (CP-933N, Sp5, Sp6) lacked DNA 
sequence similarity with λ or HK97 over the structural 
module, suggesting new head gene alleles (38). It will be 
important to determine how many different head gene 
alleles can be identified in E. coli. This figure will in turn 
determine the theoretical number of permutations that can 
be achieved by modular exchanges within a given phage 
group. 
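
If modules combine freely, the theoretical number of distinct module combinations is the product of the allele counts of all modules. The sketch below illustrates that arithmetic. Only the four head gene alleles come from the text; every other module name and allele count is a made-up placeholder, and free combination is itself an assumption that selection on interacting proteins may well violate.

```python
from math import prod


def module_combinations(alleles_per_module: dict[str, int]) -> int:
    """Upper bound on distinct phage genomes under free module exchange.

    Each module contributes one allele, so the count is the product of
    the number of alleles known for every module.
    """
    if any(n < 1 for n in alleles_per_module.values()):
        raise ValueError("every module needs at least one allele")
    return prod(alleles_per_module.values())


# Hypothetical allele counts; only "head" = 4 is taken from the text.
example = {"head": 4, "tail": 3, "lysis": 2, "late_control": 5}
print(module_combinations(example))  # 4 * 3 * 2 * 5 = 120

# Finding new head alleles raises the bound proportionally.
example["head"] = 7
print(module_combinations(example))  # 7 * 3 * 2 * 5 = 210
```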

Synthesis of Data from Dairy 
and Lambdoid Phages 

Despite the fact that dairy and lambdoid phages infect 
clearly distinct bacterial hosts (Gram-positive and 
Gram-negative bacteria), the two phage groups resemble each 
other in their mosaic relationships, arguing for comparable 
mechanisms of gene exchange. There are arguments for 
placing both phage groups in a single supergroup (5). In 
fact, when restricting the analysis to the structural gene 
cluster, one could even argue that phage Sfi21 is related 
more closely to HK97 than the two coliphages HK97 and λ 
are to each other. The same argument can be made for the 
relationship between phages Sfi11 and λ. Apparently, 
modular exchanges between the structural gene clusters from 
different lambdoid coliphages were not sufficiently intensive 
to prevent the detection of these relationships. It has been 
proposed that the intensive protein-protein interaction 
during virion morphogenesis has impeded over-intensive 
gene exchanges (8). It is less clear what selective force has 
maintained the conserved gene order after the postulated 
split of the ancestral module into the two Sfi21/HK97 and 
Sfi11/λ module lineages. 

The nonstructural genes evolved according to other laws. 
Modular exchanges are much more intensive, resulting in an 
important reshuffling of genes. It is also clear that distinct 
structural and nonstructural phage gene clusters are 
relatively free to combine. For example, nonstructural gene 
clusters that shared DNA sequence identity were associated in 
dairy phages with three distinct structural gene clusters 
(Lactococcus lactis phages BK5-T, Tuc2009, and r1t) (4, 14). 
Similar observations were made in Gram-negative bacteria 
(phages HK97, λ, P22). In fact, the chimeric origin of phage 
λ is still apparent from its G-C content, analysis of which 
showed a clear separation of structural from nonstructural 
gene clusters. When we speak of phage evolution, we should 
keep in mind that this term cannot be applied to the entire 
phage genome (which is a patchwork of different DNA 
segments with distinct evolutionary histories), but has 
meaning only for an individual DNA module. However, one 
can be confident that at least an outline of evolutionary 
analysis can be made for the structural gene cluster 
(expressly excluding the tail fiber genes) with a substantially 
increased phage database. The comparative genomics of the 
structural gene cluster may also provide a rational basis for a 
revised phage taxonomy that reflects natural relationships. 
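
A G-C content analysis of the kind mentioned above is usually done by scanning the genome with a sliding window and looking for abrupt shifts in base composition between regions. The following is a minimal sketch under that assumption; the window and step sizes are arbitrary illustration values, and the toy sequence is not a real phage genome.

```python
def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in a sequence (0.0 for an empty one)."""
    if not seq:
        return 0.0
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def sliding_gc(seq: str, window: int, step: int) -> list[tuple[int, float]]:
    """G+C fraction in successive windows as (start, fraction) pairs."""
    if window < 1 or step < 1:
        raise ValueError("window and step must be positive")
    return [
        (start, gc_fraction(seq[start:start + window]))
        for start in range(0, len(seq) - window + 1, step)
    ]


# Toy "chimeric" sequence: a GC-rich segment joined to an AT-rich one.
toy = "GGCCGGCC" + "ATATATAT"
for start, frac in sliding_gc(toy, window=8, step=8):
    print(start, frac)
```

On a real genome a window of several hundred to a few thousand bases would be more typical, with segments of distinct composition suggesting distinct evolutionary origins.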


Extension to Other Tailed Phages? 

The phage database is currently still too small to allow 
definitive conclusions on the mechanisms of diversification 
and evolution in other phage systems. In particular, claims 
on a common ancestry for tailed phages (1, 31) or their 


available data allow the observation of some trends. First, 
independent isolates derived from the same or related host 
species frequently shared DNA sequence identity over 
essentially the entire genome. Second, this overall alignment was 
in general interrupted by numerous insertions/deletions and 
replacements of single genes or small groups of genes. Third, 
graded relatedness was observed when phages were 
compared that infected increasingly distant bacterial hosts. 
In the following, we illustrate these generalizations by 
examples of phage DNA sequence alignments with the dotter 
program (figure 4-4). 
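
In a dot plot one sequence runs along each axis and a dot marks every position pair where the two sequences match; shared regions appear as diagonal lines, insertions and deletions as breaks or offsets in the diagonal. The sketch below is a deliberately simplified exact-word-match version meant only to show the idea. It is not the algorithm used by the dotter program, and the word length and toy sequences are arbitrary.

```python
from collections import defaultdict


def dot_plot(seq_x: str, seq_y: str, word: int = 4) -> list[tuple[int, int]]:
    """Return (i, j) pairs where seq_x[i:i+word] equals seq_y[j:j+word].

    Runs of dots with i - j constant form a diagonal, i.e. a region
    where the two sequences can be aligned without gaps.
    """
    if word < 1:
        raise ValueError("word length must be positive")
    # Index every word of seq_y by its start positions.
    index = defaultdict(list)
    for j in range(len(seq_y) - word + 1):
        index[seq_y[j:j + word]].append(j)
    dots = []
    for i in range(len(seq_x) - word + 1):
        for j in index.get(seq_x[i:i + word], ()):
            dots.append((i, j))
    return dots


# Toy example: the 4-mer "ACGT" occurs twice in the first sequence.
print(dot_plot("ACGTACGT", "ACGT", word=4))
```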

Siphoviridae 

L5- and ψM1-like phages infecting Mycobacteria and 
Archaea, respectively, are established genera in the 
Siphoviridae phage family. Both showed a more distant, but still 
detectable similarity in their genome organization with 
lambdoid phages (5) (figure 4-3). Within the latter group, the 
two archaeaphages ψM2 and ψM100 differed mainly by one 
insertion and by sequence diversification over the tail fiber 
genes (30) (figure 4-4A). In the former group, 
mycobacteriophages L5 and D29 were well aligned over the entire 
DNA sequence (19) and differed mainly by a 4-kb-long 
deletion in D29 (figure 4-4B). The alignment was 
punctuated by numerous but small insertions/deletions and gene 
replacements. Larger modular exchanges, in contrast, were 
not observed. In mycobacteriophages graded relatedness 

Figure 4-4 Dot-plot analysis of phage couples. Couples include: A: Archaeaphages ψM2 and ψM100; B: Mycobacteriophages 
L5 and D29; C: Salmonella phage P22 and E. coli phage HK620; D: E. coli phage T7 and Yersinia phage φYeO3-12; E: Bacillus 

mycobacteriophages revealed another member of this group, 
phage Bxz2. However, new groups constituted by related 
pairs of phage isolates (Che8/Che9d and Omega/Cjw1) or 
unclassified individual phages were also identified (40). 

c2-like Lactococcus phages (26), another established 
genus in Siphoviridae, differed substantially from the λ 
paradigm of structural gene organization, but they 
could still be aligned with the genome map of the λ-like 
Lactococcus phage sk1 (9) (figure 4-3) when allowing for 
two genome rearrangement events (4). The two groups of 
phages still shared sequence relatedness for several 
predicted proteins (4). 

In the DNA alignment Lactococcus phages c2 and bIL67 
differed mainly over the putative tail fiber region (13, 25) 
(figure 4-4H). Also, Lactococcus phages sk1 and bIL170 
could be aligned over the entire DNA sequence and differed 
mainly by the insertion of several endonuclease genes (see 
dotter display in 14). 

Podoviridae 

Taxonomists classify Salmonella phage P22 as a podovirus 
(phages with short, noncontractile tails). P22 demonstrated 
a graded relationship with Podoviridae infecting 
increasingly distant hosts. With E. coli phage HK620, P22 still 
shared DNA sequence similarity over large genome regions 
(figure 4-4C), while with phage APSE-1 infecting a member 
of the Proteus group most of the similarity was limited to the 
protein sequence level (10, 47, 48). Interestingly, the DNA 
packaging and head genes from P22-like phages showed a 
similar genome organization to phage λ; however, protein 
sequence similarity was no longer detected. 

T7-like phages are another genus in the Podoviridae 
family. E. coli phage T7 can be aligned with Yersinia phage 
φYeO3-12 at the DNA sequence level over essentially the 
entire genome if one allows for numerous isolated gene 
replacements (figure 4-4D) (39). In contrast, roseophage 
SIO1 infecting an evolutionary relative of Rhodobacter 
(α-Proteobacteria; E. coli and Yersinia both belong to the 
γ-subclass) showed a similar genome organization, and 
shared protein sequence relatedness but no DNA sequence 
relatedness with T7 (44). In the φ29-like genus of 
Podoviridae, the Bacillus subtilis phages GA-1 and PZA could be 
aligned in a mosaic fashion over most of the genome with 
one DNA inversion (figure 4-4E). With the Streptococcus 
phage CP-1, PZA shared DNA sequence similarity only over 
this inversion. 

Myoviridae 

The same pattern of overall DNA sequence alignment and 
graded relatedness was also observed in phages with a 
contractile tail (Myoviridae). E. coli phage Mu (representative 
of the Mu-like genus in the Myoviridae family) and E. coli 
O157 Sakai prophage Sp18 could be aligned over about half 
of the genome length at the DNA level in a mosaic fashion 
(figure 4-4F). Mu shared protein sequence similarity with 
the Shewanella prophage MuSo2, but shared only overall 
genome organization, in the absence of sequence similarity, 
with a Deinococcus prophage (36). 

Due to a larger data set, a more detailed picture can 
already be painted for the P2-like genus of Myoviridae. E. 
coli phages P2 and 186 shared DNA sequence similarity 
over the entire structural gene cluster except for two extra 
genes in P2 (one is a lysogenic conversion gene) (17) 
(figure 4-4G). In comparison, Haemophilus phage HP1 
shared with P2 sequence-related head proteins but no 
DNA sequence similarity (16). Vibriophage K139 lacked 
any sequence similarity with P2 in spite of a conserved 
genome map. 

Also for the T4-like genus of Myoviridae, complete phage 
genome sequences are slowly accumulating. Early heteroduplex 
analysis of the T-even phages T2, T4, and T6 demonstrated 
at least 10% DNA sequence divergence between 
T-even coliphages (24). The heterologous DNA sequences 
are present as blocks ranging in size from 200 bp to 3 kb. 
Sequence analysis of selected conserved phage genes such 
as the capsid gene demonstrated gradients of relatedness 
leading to the definition of T-evens, Pseudo, Schizo, and Exo 
T-evens (21, 46). For gp23 this grouping represented protein 
sequence similarities greater than 90%, 60%, 50%, and 
30%, respectively. Different genes led to similar phylogenetic 
trees, suggesting that in T4-like phages vertical evolution 
dominated over horizontal evolution, again with the notable 
exception of the tail fiber genes (20). Complete sequence data 
are now available for T4 and the T-even phage RB69 and the 
Pseudo T-even phage RB49 (16, 50). Their genomes are 
largely colinear with respect to the gene map from T4. Dot 
plots again revealed a gradient of relatedness. T4 and the 
T-even phage RB69 could be aligned over nearly the entire 
genome length at the DNA sequence level. The alignment 
was, however, interrupted by many small and a few larger 
gaps (data not shown). One of the larger gaps was around 
the genes encoding the DNA modifying enzymes of T4. In 
contrast, T4 and the Pseudo-T-even phage RB49 shared a 
much lower level of DNA sequence identity and a straight 
line in their genome alignment was only evident for 
the structural genes (data not shown). The three T4-like 
coliphages differ from each other by insertion/deletion or 
replacements of small groups of genes. At the protein level, 
RB69 and RB49 had homologues of all essential T4 replication 
genes, but their sequences diverged considerably from 
their T4 homologues. In contrast, many of the nonessential 
T4 genes are absent from RB69 and RB49 and have been 
replaced by unknown sequences (16, 50). Remarkably, the 
grouping of the T4-like phages also reflected an increasingly 
distant relationship of the host bacteria: T-evens were 
isolated from Enterobacteriaceae, Pseudo T-evens covered a 
larger range of γ-Proteobacteria, Schizo T-evens reached 
into α-Proteobacteria, and Exo T-evens, still sharing a low 
level of protein sequence identity with T4, were isolated from 
Cyanobacteria (21, 46). The Schizo T-even vibriophage 
KVP40 shared 36% of the T4 open reading frames, demonstrating 
30-60% amino acid identity over the DNA replication, 
recombination, and repair as well as viral capsid and 
tail genes. However, its genome is substantially larger than 
that of T4 (240 vs. 170 kb) and 65% of its open reading 
frames lacked any database matches (35). The conservation 
of sequence relatedness in T4-like phage genomes over such 
large phylogenetic distances of host bacteria is fascinating. 
λ-like phages infecting γ-Proteobacteria and low G-C 
content, Gram-positive bacteria have lost nearly all sequence 
similarity and maintained only a related gene map for the 
structural genes. 

Troubles Ahead? 

The comprehensive sequencing work of the Pittsburgh 
Phage Institute revealed substantial sequence diversity 
and many new types of genome organization in 
mycobacteriophages (39). Similarly, Pseudomonas phage phiKZ, 
a myovirus with a 280 kb genome, lacked database matches 
for most of its genes (34). There is thus still substantial 
terra incognita in the sequence space of tailed phages. 
This observation is also underlined by random sequencing 
of uncultured virus DNA from seawater: 100 liter 
samples yielded estimates of up to 7000 different viral 
sequences (3). Statistical models predicted perhaps a 
billion different phage-related open reading frames in the 
oceans (43). 

Outlook 

We are currently in a transition period. An analysis of 
200 complete phage genome sequences in the database 
has allowed us to perceive some basic principles of phage 
genome diversification. On a more ambitious scale with 
several thousand complete phage genomes from a wide 
evolutionary range of host bacteria, we might arrive at a 
sequence-based theory of phage evolution and first 
insights into the relationship of phages to eukaryotic 
viruses and possible links of phage genomes to the 
universal tree of life. Alternatively, we might realize that 
phages are so diverse with respect to their gene content 
that the data challenge our current taxonomic systems 
of phage classification and hypotheses on phage evolution. 
Whatever the outcome, the ease and relatively low 
cost of phage sequencing, combined with the extensive 
knowledge on model phages such as λ and T4, could 
give phage genomics a lead role in population genetics, 
the evolution of simple DNA genomes, and the modeling 
of a realistic DNA sequence space. 
Phage Ecology 

STEPHEN T. ABEDON 


Phage ecology is the study of the interactions between 
phage and their environments. These interactions are 
consequential, particularly to the extent that they affect 
bacteria. During traditional molecular phage characterization, 
however, only minimal consideration of ecology 
is made. Contrasting this tendency, here I consider phage 
organismal, population, community, and ecosystem ecology 
(Table 5-1). For additional approaches to the review of phage 
ecology, as well as the related field of phage environmental 
microbiology, see (7, 8, 10, 18, 20, 26, 39, 41, 49, 53) together 
with various reviews of aquatic and ecosystem phage ecology 
(38, 56, 59, 68, 69, 78). Visit www.phage.org for additional 
phage ecology resources. 

Phage Organismal Ecology 

The Basic Phage Life Cycle 

While underlain by copious variety and detail, the general 
phage life cycle involves adsorption, infection, and release, 
together with a consideration of phage decay (figure 5-1). 
Adsorption may be further divided into a diffusion-mediated 
extracellular search, collision between phage and bacterium, 
attachment between phage and bacterium, and 
nucleic acid uptake into the bacterial cytoplasm. Infection 
may be divided into an eclipse period and a period of 
phage-progeny maturation. The eclipse period is either prevegetative, 
in the sense of immediately preceding phage-progeny 
maturation, or is temporarily or greatly extended, as 
observed, respectively, with pseudolysogeny and lysogeny. 
Release can occur by various mechanisms, depending on 
the phage, including via lysis, via extrusion, or via budding. 
Failure can occur during most or all of these steps, resulting 
in virion or infection inactivation. 

The study of these processes, especially from the perspective 
of in situ costs, benefits, constraints, expression, 
and per-infection productivity, is the province of the phage 
organismal ecologist. More broadly, one can view virus 
organismal ecology as the study of the adaptations viruses 
employ to overcome physical, chemical, or biological 
barriers to transmission between hosts (46). In this section 
I briefly introduce the phage life cycle as viewed from an 
ecological perspective. 

Phage Adsorption 

Phage adsorption begins after phage release from infected 
cells and ends with the uptake of phage genomes into 
the cytoplasms of permissive bacterial hosts. The more 
rapidly phage adsorb, the greater their likelihood of avoiding 
decay (21, 37, 58) and the shorter their overall life cycle (5). 
Phage mutants displaying faster adsorption than their 
parental wild type have been isolated from laboratory 
cultures (32), which suggests that phage adaptation to new 
hosts or new conditions can involve evolution of increased 
adsorption rapidity. This result could also occur if phage 
adsorption rates in the wild are not always maximal. 
One probable example of the latter is seen with phage T4 
which, by requiring specific adsorption cofactors, may be 
adsorption-competent within environments in which 
healthy hosts are likely (e.g., colons) but less adsorption- 
competent in environments where healthy hosts are less 
likely (e.g., the extracolonic environment; 31, 49). 

Infection (the Latent Period) 

The phage latent period begins with the eclipse, during 
which mature virions are not yet associated with phage- 
infected bacteria. The post-eclipse phase of an infection can 
be understood as a period of phage-progeny maturation 
since mature virions may be released from infected bacteria 
either following artificial lysis (36) or, as is the case for 
filamentous phage (see chapter 12), without host lysis. 
Maturation, for highly virulent phage, mostly occurs at a 
constant, linear rate rather than exponentially because the 
rate of synthesis of virion components is limited by the host’s 
anatomy or physiology (42, 74). The situation is complicated, 
however, if cells are able to continue to grow and divide 
during phage infection. This is because ongoing growth of a 
phage-infected bacterium presumably can replenish those 
cell components, such as ribosomes, that can limit rates of 
macromolecule synthesis. At the same time, phage can 
negatively affect the division of infected bacteria. These 
negative effects can range from a slowing of host population 
growth as seen with filamentous phage (48, 63) to a complete 
cessation of host division as seen with highly virulent phage 
(e.g., chapter 18). 

Table 5-1 Defining Phage Ecology 

Ecology: Organismal 
A Bacterium is...: A bacterium is a target, or an entity that impacts on phage phenotype 
Considerations: Phage anatomy, physiology, and behavior characterized from an in situ or a Darwinian perspective; virion stability, survival, and adsorption; eclipse period, latent period, and burst size; adaptations overcoming barriers to transmission between hosts 
Experiments: Single-step growth; adsorption curves; kinetics of phage decay 

Ecology: Population 
A Bacterium is...: A bacterium is an environmental resource 
Considerations: Phage population growth and density; liquid versus spatially structured environments (broth growth versus plaque growth); low versus high phage multiplicity; lysogeny versus active phage replication; within-bacteria competition 
Experiments: Batch growth; plaque growth; phage stock preparation 

Ecology: Community 
A Bacterium is...: A bacterium is a partner in coevolution 
Considerations: Phage-host coevolution; impact of phage density on density of uninfected bacteria and vice versa; community stability; host resistance; phage host-range breadth and variation; transduction and phage (or lysogenic) conversion; interaction between different phage species 
Experiments: Phage-host continuous-culture or serial-transfer experiments; in situ observation and experiment 

Ecology: Ecosystem 
A Bacterium is...: A bacterium is a lower trophic level 
Considerations: Phage impact on ecosystem nutrient cycling and energy flow; short-circuiting of the microbial loop; nutrient release from eukaryote tissues due to attack by phage-encoded toxins 
Experiments: In situ observation and experiment 

Figure 5-1 General phage life cycle. Continuous lines are 
paths that must be traversed to complete the phage cycle 
while dashed lines are optional and may be detrimental to 
phage propagation. I use the word “Decay” in its most 
general sense to describe conversion of phage virions or 
intact phage-infected bacteria from viable plaque-forming 
units, respectively, to non-infectious virions or to phage-infected 
bacteria that are unable to produce phage virions. 

Phage Progeny Release 

Lysis involves a tradeoff between maximizing per-infection 
phage productivity and minimizing the phage generation 
time (5, 23). So long as a virus particle remains inside an 
infected bacterium it is not free to acquire a new host. 
For most phage the release of progeny is controlled by phage 
proteins (e.g., holins) and coincides with the destruction of 
the parental infected cell (lysis; see chapter 10). For filamentous 
phage, which extrude their phage progeny across the 
host cell envelope, release does not necessarily result in 
host cell death (see chapter 12), thereby allowing filamentous 
phage to bypass, to some degree, the tradeoff between 
per-infection productivity and generation time. 

Phage Decay 

If one is willing to accept that phage are alive (e.g., 27), then 
phage decay, in its narrowest sense, is equivalent to virion 
death (or “inactivation” or “loss of titer”; 37). Phage decay 
(68, 69, 78) likely limits the impact of phage on bacteria (70) 
and also imposes important constraints on virulent phage; 
it implies that virion populations cannot survive indefinitely 
in the absence of sufficient densities of susceptible bacteria 
(e.g., 30, 76). Similarly, the evolution of lysogeny (see chapter 
7) must be dependent at least in part on the relative importance 
of virion decay versus phage and prophage replication 
rates as, for example, Stewart and Levin (67) suggest with 
their “hard times” hypothesis (see also 39). In general, 
decay impacts directly on phage per-infection productivity 
by reducing the duration of phage-progeny survival. 
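
The constraint that decay places on virulent phage can be illustrated with a back-of-the-envelope calculation. The sketch below is not taken from the cited references; it assumes, for illustration only, that free virions decay at a constant per-virion rate and adsorb by mass action at rate kN, so that a virion adsorbs before it decays with probability kN/(kN + d). On that assumption a virion population can replace itself only if the burst size times this probability exceeds one, i.e., if N > d/(k(B - 1)). The adsorption constant and burst size are the values quoted for figure 5-2; the decay rate is a hypothetical placeholder, and latent period, outflow, and host growth are ignored.

```python
def adsorption_probability(host_density, k, decay_rate):
    """Chance that a free virion adsorbs before it decays (assumed model:
    two competing first-order processes, adsorption at k*N and decay at d)."""
    adsorption_rate = k * host_density
    return adsorption_rate / (adsorption_rate + decay_rate)


def threshold_host_density(k, burst_size, decay_rate):
    """Host density above which each virion leaves, on average, more than
    one adsorbing descendant under the assumed model."""
    return decay_rate / (k * (burst_size - 1))


K_ADSORPTION = 2.5e-9   # ml/min, value quoted for figure 5-2
BURST_SIZE = 75         # phage/cell, value quoted for figure 5-2
DECAY_RATE = 1e-3       # per min, hypothetical illustration only

n_min = threshold_host_density(K_ADSORPTION, BURST_SIZE, DECAY_RATE)
print(f"Assumed threshold host density: {n_min:.3g} bacteria/ml")
```

Under these assumptions a higher decay rate or a smaller burst size raises the host density needed to sustain the virion population, which is the qualitative point made above.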

Phage Population Ecology 

Phage Population Growth 

While phage organismal ecology emphasizes virion production 
and survival (above) and phage community ecology 
has an emphasis on host population dynamics (below), 
the emphasis of phage population ecology (table 5-1) is on 
phage population growth and intraspecific competition, 
either within bacterial cultures (5) or within individual 
infected bacteria (28). Like any organism living within 
a suitable environment possessing sufficient resources, a 
phage population will increase in number exponentially 
(2, 24, 34). Phage populations that increase in size most 
quickly should acquire host cells most rapidly. The acquisition 
and exploitation of one bacterium by one phage 
means that one unit of bacterial resource may be less 
available for exploitation by a second phage. Over the 
short term, in relatively simple environments, selection 
within phage populations should therefore be for both 
more rapid phage population growth and more rapid host 
cell acquisition. 

Certain phage characteristics can contribute to this more 
rapid growth and host cell acquisition (e.g., 33), though not 
necessarily without compromise (6, 20). For instance, we 
should expect evolution to favor decay resistance, particularly 
so long as this does not require excessive virion sophistication, 
interfere with rates of phage adsorption, or compete 
with phage per-infection productivity. Similarly, within 
more fluid environments we expect selection to favor rapid 
phage adsorption, though during plaque growth the same 
expectation is not necessarily realized (as discussed below; 
see also 32). Since the eclipse period represents a delay until 
the start of phage-progeny maturation, we also expect 
selection to result in a shortening of phage eclipse periods. 
Lysogeny (as well as pseudolysogeny; 12, 55), with its 
extended delay between infection and progeny maturation, 
is presumably the product of a countering selection for 
longer rather than for shorter delays. One also expects a 
selective favoring of phage that display higher rates of 
progeny maturation, once the period of progeny maturation 
has begun, though a maximization of these rates is likely 
balanced against virion-particle sophistication as well as 
any host-maintenance costs required for continued infection. 
Similarly, progeny release, once initiated, should proceed 
rapidly. Below I consider in greater detail forces that 
may impact on the evolutionary optimization of the duration 
of phage latent periods. 

Evolution of Phage Latent Period 

From an ecological perspective we can divide the members 
of phage populations into two groups: prereproductive and 
reproductive. Prereproductive phage are those engaged in 
either adsorption (including the extracellular search for 
susceptible bacteria) or the eclipse: during these periods the 
phage is not generating mature phage progeny. Reproductive 
phage, by contrast, are those infecting bacteria during the 
period of phage-progeny maturation. For phage that must 
lyse their host bacteria to disseminate phage progeny, we 
may describe a period of progeny maturation as optimal in 
duration should the latent period giving rise to it result 
in maximized rates of phage population growth. Latent periods 
that are too short result in insufficient burst sizes 
to sustain maximal phage population growth, while those 
that are too long slow phage population growth by delaying 
phage-progeny acquisition of uninfected bacteria. 

When prereproductive periods are short, this means 
that free phage can rapidly find uninfected cells and then 
rapidly gear up for intracellular progeny maturation. Such 
conditions should select for rapid infection turnover (via 
shorter latent periods) such that phage progeny acquire 
uninfected hosts before those cells are obtained by competing 
phage. In general, then, high host densities and short 
phage eclipse periods should select for shorter phage latent 
periods (5). When prereproductive periods are long, by contrast, 
the reproductive period, once begun, is more valuable, 
thereby resulting in selection for increased per-infection 
productivity. Thus, low host densities or long phage eclipse 
periods should select for larger phage burst sizes, even at 
the expense of longer phage latent periods (5). The first of 
these predictions was recently confirmed by competing a 
mutant of phage RB69 against its longer latent period wild- 
type parent (6): higher host densities selected for the shorter 
latent period despite the burst-size cost while lower host 
densities selected against the mutant. We would expect similar 
compromises to hold for phage that release their progeny 
via extrusion, though in those systems the important 
balance would be between the kinetics of phage release and 
its impact on both infected-host replication (63) and overall 
latent-period duration (48). For these latter phage, low densities 
of phage-susceptible bacteria have been shown to select 
for higher infected-bacteria fitness, which may be achieved 
by reducing per-bacterium rates of phage propagation (22). 
See (23) for a broad discussion of the use of phage 
latent period and population growth as models for the study 
of evolutionary optimization from a genetic-mechanism 
perspective. 
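
The tradeoff between latent period and burst size can be made concrete with a deliberately simplified calculation. The sketch below is an illustration under stated assumptions, not the method of reference (5): it assumes that burst size rises linearly once the eclipse ends, that every free phage adsorbs after exactly the mean free time 1/(kN), and that the population growth rate is ln(burst size) divided by the total generation time. The eclipse duration and maturation rate are hypothetical values; only the adsorption constant is the one quoted for figure 5-2.

```python
import math


def growth_rate(latent, host_density, k=2.5e-9, eclipse=10.0,
                maturation_rate=5.0):
    """Approximate per-minute growth rate for a given latent period (min).

    Assumed model: burst = maturation_rate * (latent - eclipse), and one
    generation lasts mean free time 1/(k*N) plus the latent period.
    eclipse and maturation_rate are hypothetical illustration values.
    """
    burst = maturation_rate * (latent - eclipse)
    if burst <= 0:
        return float("-inf")  # lysis before any progeny have matured
    generation_time = 1.0 / (k * host_density) + latent
    return math.log(burst) / generation_time


def optimal_latent_period(host_density):
    """Grid search (0.5 min steps, up to 1000 min) for the latent period
    that maximizes the assumed growth rate."""
    candidates = [10.5 + 0.5 * i for i in range(1980)]
    return max(candidates, key=lambda latent: growth_rate(latent, host_density))


for log_n in (8, 7, 6):
    best = optimal_latent_period(10.0 ** log_n)
    print(f"host density 1e{log_n}/ml: optimal latent period ~ {best:.1f} min")
```

In this toy model the optimum lengthens as host density falls, in line with the predictions above. Because it treats adsorption as taking exactly the mean free time, it corresponds to the cruder of the two approaches and should not be read as a quantitative prediction.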

Contribution of Early Adsorbers to Phage 
Population Growth 

Phage adsorption occurs essentially as an exponential decay 
in free-phage density (figure 5-2A). For any phage cohort 
released at a given moment into a population of potential 
hosts, phage adsorption occurs such that some constant 
fraction of remaining free phage will adsorb over any given 
interval. As a consequence, more phage from a given cohort 
will adsorb during an earlier interval compared with a later 
interval. Furthermore, if by chance a phage adsorbs a host 
earlier rather than later, then the duration for which this 
phage is prereproductive will be shorter and therefore the 
total duration of that phage's life cycle will also be shorter. 
The rate of phage population growth is a function of the 
duration of the phage life cycle along with the phage burst 
size. If earlier-adsorbing phage are potentially greater in 
number due to the exponential kinetics of phage adsorption, 
are less susceptible to decay (above), and display shorter 
life cycles, then it stands to reason that members of 
phage-virion populations that by chance adsorb earlier will 
contribute more to phage exponential growth than 
later-adsorbing members. 

Figure 5-2 Exponential phage adsorption and phage population growth. A: Free-phage adsorption according to the model of 
Stent (66) with log10(N) (N = per ml host density) as indicated for different curves and k (the phage adsorption constant) set 
equal to 2.5 × 10^-9 ml/min. Adsorption curves cross the horizontal line at the average phage adsorption time (mean free 
time) = 1/(kN). B: Free-phage mean free time graphed as a function of bacterial density. C: Log10 phage density following 
1000 min of phage growth as simulated or calculated at different host densities (the log scale and log transformation of the 
y-axis are both intentional; the number 10^2 represents a phage density of 10^100 phage/ml). Assumed are a latent period of 
25 min and burst size of 75 phage/cell, together with an adsorption constant of 2.5 × 10^-9 ml/min. Simulations involve a 
modeling of exponential phage adsorption (circles), which is equivalent to the Stent-modeled adsorption curves in panel A. 
For calculations it was assumed that individual free phage adsorb after an extracellular search of 1/(kN) min (squares) as 
calculated as a function of host density in panel B. D: Phage latent periods (optima) that give rise to maximal phage 
population growth as determined using simulations (meaning adsorption via exponential free-phage decline; circles) or 
calculations as described for panel C (meaning adsorption for all free-phage cohorts is mean free time in duration; squares). 
Panel D is a variation on a figure originally published in (5) and is used with permission as granted by the American Society 
for Microbiology. See (5) for discussion of methods for all panels. 
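
The exponential adsorption kinetics described here can be sketched in a few lines. The example assumes the simple mass-action form in which the free fraction of a cohort after t minutes is exp(-kNt), with k the adsorption constant quoted for figure 5-2 and N the host density; the function names are mine and purely illustrative.

```python
import math

K_ADSORPTION = 2.5e-9  # ml/min, adsorption constant quoted for figure 5-2


def free_fraction(t_min, host_density, k=K_ADSORPTION):
    """Fraction of a phage cohort still unadsorbed after t_min minutes,
    assuming first-order (exponential) loss of free phage."""
    return math.exp(-k * host_density * t_min)


def mean_free_time(host_density, k=K_ADSORPTION):
    """Average time a free phage spends searching before adsorbing: 1/(kN)."""
    return 1.0 / (k * host_density)


def adsorbed_during(t_start, t_end, host_density, k=K_ADSORPTION):
    """Fraction of the original cohort adsorbing between t_start and t_end."""
    return free_fraction(t_start, host_density, k) - free_fraction(t_end, host_density, k)


for log_n in (6, 7, 8):
    n = 10.0 ** log_n
    mft = mean_free_time(n)
    print(f"N = 1e{log_n}/ml: mean free time {mft:.0f} min; "
          f"first vs second such interval: "
          f"{adsorbed_during(0, mft, n):.3f} vs {adsorbed_during(mft, 2 * mft, n):.3f}")
```

Because a constant fraction of the remaining free phage adsorbs in each interval, any earlier interval always captures more of the cohort than a later interval of equal length, whatever the host density.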

At high host densities phage adsorb relatively rapidly 
such that variance in phage prereproduction duration is not 
large. However, at lower host densities the timing of the 
adsorption of the majority of a phage cohort is spread over 
much longer intervals (figure 5-2A), and the contribution 
of those phage that by chance are adsorbed by hosts earlier 
becomes increasingly large and important to overall phage 
population growth. Thus, the average timing of phage 
adsorption (the phage mean free time; see 5) declines as 
a direct function of host density (figure 5-2B), but phage 
population growth as a function of host density does not 
decline as quickly (figure 5-2C). As a result, while evolution 
ought to favor phage with longer replicative periods given 
lower host densities, optimal latent periods at lower 
host densities are not nearly as long as would be predicted 
based on phage mean free times (circles, figure 5-2D; 5). 






Phage Plaque Growth 

Phage can grow within soft agar overlays—a simple, 
spatially structured environment—as plaques punctuating 
otherwise opaque bacterial lawns. Phage growth in plaques 
may be considered to occur in four stages (47): (i) initial host 
adsorption of seeded phage, (ii) an initial round of bacterial 
infection, (iii) an “enlargement phase” that involves multiple 
rounds of adsorption, infection, and release, and (iv) the 
end of the enlargement phase, which typically is associated 
with physiological changes in the bacterial lawn. Differences 
between phage growth in plaques versus broth occur 
throughout the enlargement phase, during which the physical 
structure of solid media (i) slows diffusion, (ii) prevents 
gross environmental mixing, and (iii) probably gives rise 
to local phage multiplicities that are higher than those 
observed over the majority of phage population growth in 
well-mixed broth media. Phage growth within plaques additionally 
introduces plaque size as a means by which issues of 
phage fitness may be addressed (e.g., 51, 52). 

We can imagine at least five selective pressures that 
act on phage during plaque growth: (i) at the periphery 
of plaques, which is where uninfected bacteria are found, 
there should be selection for more rapid exponential 
growth, such as for shorter phage latent periods when 
host densities are high (above); (ii) regardless of location 
within a plaque, during the plaque enlargement phase 
there should be selection for fast diffusion away from the 
plaque center such that uninfected bacteria surrounding 
the plaque may be obtained and exploited (similarly, see 
early interpretations of the classic assertion that smaller 
virions produce larger plaques than larger virions; e.g., 77); 
(iii) towards the center of plaques—where there is a low 
prevalence of uninfected bacteria—there should exist a 
countering selection for greater burst sizes even at the 
expense of longer latent periods; (iv) throughout the plaque 
there should be selection exerted by the tendency of phage 
to decay (50), including by adsorption to cell debris (61) 
or adsorption to infected cells (the latter due to superinfection 
exclusion; 3); and (v) there can be selection for 
maintenance of phage growth despite the physiological 
aging of the bacterial lawn (e.g., phage T7; 52). Given 
this myriad complexity, how, where, and when one determines 
phage fitness during plaque growth becomes 
extremely important since different plaque regions may 
be under different selective pressures. These pressures 
can vary in addition with host density as well as host 
physiological state over the course of plaque development. 

As a further complication, plaque size does not necessarily 
correlate with per-infection productivity. It has been 
hypothesized, for instance, that phage displaying shorter 
latent periods, even given smaller burst sizes, could display 
larger plaques (47, 79). Longer latent periods resulting 
in smaller plaque sizes are most commonly (and classically) 
observed among T-even phage where lysis-inhibition 
defective (r) mutants display larger plaques and conditionally 
shorter latent periods than lysis-inhibition competent, 
wild-type phage (35, 45). Recently phage RB69 mutants 
have been characterized that display larger plaques 
than wild-type RB69 along with shorter latent periods and 
smaller burst sizes (6). 

It also has been hypothesized (79) that reducing the likelihood 
of host attachment given phage-host collision 
(which, along with rates of phage diffusion, is a major component 
of the phage adsorption constant; 66) can increase 
rates of plaque enlargement. The assumption is that with 
lower attachment efficiency phage spend less time infecting 
cells and more time diffusing outward from the periphery 
of plaques. Indeed, one explanation for why phage λ lost 
its tail fiber upon domestication (44) is that reduced attachment 
efficiency resulted in the formation of inescapably 
selectable larger plaques. Sarma and Kaur (64) observed 
perhaps similar results with various host-range mutants of 
cyanophage N-1. 

Phage Community Ecology 

Community Stability 

Phage community ecology (table 5-1) emphasizes the 
bacterial host, for example the impact of phage on bacterial 
densities and the evolution of phage resistance (9, 16). Phage 
community ecology also considers phage-host coevolution 
such as the propensity of phage to evolve strategies that 
counter mechanisms of host resistance. Bacteria that 
are resistant to phage attack can contribute to the stability 
of phage-containing communities by impeding bacterial 
extinction. Stability additionally refers to the range in densities 
of host and phage populations as they oscillate over time, 
with greater oscillation amplitude (density variance) corresponding 
to lower community stability. Note that reduced 
community stability (i.e., greater bacterial population 
instability potentially resulting in local bacterial extinction) 
is the goal of phage therapy and other means of 
phage-based biocontrol of bacterial densities (see chapter 
48 and reference 40). 

Phage community stability in the laboratory is typically 
studied within the continuous phage-bacterial cocultures 
commonly referred to as chemostats (60). A chemostat 
possesses a reservoir containing sterile media that flows to 
a well-mixed growth vessel containing microorganisms. 
Flow from reservoir to growth vessel may be controlled via 
the use of a peristaltic pump, with outflow from the growth 
vessels occurring at the same rate as inflow. Figure 5-3A 
presents a simulation of a relatively unstable chemostat. 
Note that phage have driven phage-sensitive bacteria to 
extinction after about 110 hours of chemostat progression. 
Some 100 hours later phage densities decline to extinction 
due to virion outflow from the chemostat growth 
chamber. Phage-host communities within chemostats, 
however, are often more stable than may be accounted 
for by phage community ecology theory (16, 65). Below I 
consider some of the reasons why. 

Figure 5-3 Computer-simulated chemostats. A: Chemostats were simulated employing the method and parameter values of 
Bohannan and Lenski (14). Time steps here are 1 min rather than 3 min; the initial host (continuous lines) and phage (dotted 
lines) densities are 10^4/ml and 10^5/ml, respectively. Unless otherwise noted, the limiting nutrient is glucose that is found in 
the chemostat reservoir at a density of 0.5 mg/l. B: Shown is a chemostat equivalent to that in panel A though with the 
phage adsorption constant reduced by one half. C: Shown is a chemostat equivalent to that in panel A but with one half as 
much limiting glucose present within the virtual chemostat reservoir. D: Shown is a chemostat equivalent to that in panel A 
except that the phage burst size is reduced by one half. Phage and bacterial densities during simulations were sampled for 
inclusion in graphs once every 30 min. These simulated chemostats contain no phage-resistant bacteria or other bacterial 
refuges from phage attack. Extinction is assumed to occur at or below densities of 10^-2 phage or bacteria/ml. 
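
For readers who wish to experiment, a chemostat of this general kind can be sketched as follows. This is a minimal illustration and not the Bohannan and Lenski (14) method used for figure 5-3: it assumes Monod-type host growth on a single limiting nutrient, mass-action adsorption, and instantaneous lysis with no latent period, all integrated with a simple Euler step of 1 min. The reservoir concentration (0.5 mg/l, written here as 0.5 µg/ml) and initial densities follow the figure 5-3 caption, and the adsorption constant is the figure 5-2 value converted to ml/h; the dilution rate, maximum growth rate, half-saturation constant, nutrient cost per cell, and burst size are hypothetical placeholders.

```python
def simulate_chemostat(hours=200.0, dt=1.0 / 60.0, hosts0=1e4, phage0=1e5,
                       k=1.5e-7, burst=80.0, reservoir=0.5, dilution=0.2,
                       mu_max=0.7, half_sat=0.1, cost=2e-6):
    """Euler integration of an assumed, simplified phage-host chemostat.

    Units: time in hours, nutrient in ug/ml, densities per ml.
    Returns the final (nutrient, hosts, phage) values.
    """
    r, n, p = reservoir, hosts0, phage0
    for _ in range(int(round(hours / dt))):
        mu = mu_max * r / (half_sat + r)      # Monod growth rate (assumed)
        infections = k * n * p                # mass-action adsorption
        dr = dilution * (reservoir - r) - cost * mu * n
        dn = mu * n - infections - dilution * n
        # Each infection consumes one virion and releases `burst` progeny
        # immediately (no latent period in this sketch).
        dp = (burst - 1.0) * infections - dilution * p
        r = max(r + dr * dt, 0.0)
        n = max(n + dn * dt, 0.0)
        p = max(p + dp * dt, 0.0)
    return r, n, p


# Compare a phage-free run, the default run, and a run with the adsorption
# constant halved (the manipulation described for figure 5-3B).
for label, kwargs in (("no phage", {"phage0": 0.0}),
                      ("default k", {}),
                      ("k halved", {"k": 0.75e-7})):
    nutrient, hosts, phage = simulate_chemostat(**kwargs)
    print(f"{label}: nutrient={nutrient:.3g} ug/ml, hosts={hosts:.3g}/ml, "
          f"phage={phage:.3g}/ml")
```

Because the latent period is omitted and most parameters are invented, the output should not be expected to reproduce figure 5-3; the sketch is intended only to show how outflow, adsorption, and burst size enter such a simulation.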

Refuges 

Levin et al. (54) speculated that refuges, which shield sensi¬ 
tive bacteria from phage attack, could increase the stability 
of phage-host communities, as subsequent experiments 
have corroborated (65). In such a scheme the phage-induced 
extinction of sensitive bacteria is prevented by their hiding, 
for example, within chemostat wall populations. Through 
cell division these hosts can supply phage-sensitive bacteria 
to the liquid (unrefuged) phase of the chemostat. Once phage 
populations have declined due to their outflow from the 
chemostat, the liquid-phase host population can then 


grow back to population densities present prior to phage 
attack. 

Slowed Adsorption 

Bohannan and Lenski (15) describe bacteria that have 
entered a “genetic” refuge (phage-resistant mutants) as 
“invulnerable prey.” However, unless a phage’s collision with 
a bacterium results in some degree of phage-host attach¬ 
ment, then a resistant bacterium is not potential prey but 
instead some relatively inert component of the environment 
from which phage “bounce.” Wilkinson (75), on the other 
hand, has suggested a model in which completely resistant 
bacteria really are invulnerable “prey.” Here the assumption 
is that the “predator” species (in this case a Bdellovibrio rather 
than a phage) may reversibly interact with non-prey bacteria 
by pausing following collision. This delay in detachment 
extends the Bdellovibrio’s extracellular search. From the 
perspective of susceptible bacteria, this delay is equivalent 
to a reduction in effective predator density. Wilkinson’s 
conclusion upon modeling such a system is that the presence 
of invulnerable-prey bacteria, even in the absence of 
metabolic competition with vulnerable-prey bacteria, will 
result in an increase in the stability of Bdellovibrio-sensitive 
bacterial densities. 

Reductions in phage adsorption rates could similarly
result in increased community stability. For instance, a
partial reduction in host receptivity to phage adsorption
(e.g., bacteria partially resistant to phage T2) should contribute
to an increase in community stability by delaying phage
attachment to sensitive bacteria (17). Figure 5-3B presents a
simulated chemostat for which the phage adsorption constant
has been reduced by one half, and bacteria and phage
extinctions are thereby avoided. Again with phage T2, there
apparently is a tendency for these phage to be temporarily
adsorption-inhibited (up to weeks at room temperature)
following release from infected bacteria (62). This phenomenon
could also serve to increase community stability by
delaying phage adsorption. By reducing phage numbers,
mechanisms of phage decay, including outflow from chemostat
growth chambers, should also have the effect of increasing
community stability. Furthermore, phage evolution
could result in a decrease in community stability by increasing
rates of phage adsorption or by reducing rates of phage
decay.
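
One hedged way to see why slower adsorption and faster phage loss act in the same direction is to ask how dense the host population must be before free phage can increase at all. The sketch below assumes a deliberately simplified model with no latent period, in which each adsorption consumes one phage and yields a full burst, and in which phage are lost by outflow and by an optional decay term. The model, the function name, and all parameter values are illustrative assumptions, not the simulation behind figure 5-3.

```python
def threshold_host_density(adsorption_const, burst_size, dilution_rate,
                           decay_rate=0.0):
    """Host density (cells/mL) above which free phage numbers increase.

    Simplified assumption: dP/dt = (B - 1) * k * N * P - (w + m) * P,
    where B is burst size, k the adsorption constant (mL/h), N the host
    density, w the dilution rate (1/h), and m a phage decay rate (1/h).
    Setting dP/dt = 0 gives N* = (w + m) / ((B - 1) * k).
    """
    return (dilution_rate + decay_rate) / ((burst_size - 1) * adsorption_const)

# Hypothetical parameter values, chosen only for illustration
base = threshold_host_density(1e-7, 100, 0.2)
slow = threshold_host_density(0.5e-7, 100, 0.2)   # adsorption constant halved
print(f"threshold host density: {base:.3g} -> {slow:.3g} cells/mL")
```

In this simplified picture, halving the adsorption constant doubles the host density that phage need before they can increase, and added decay raises the threshold further.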

Reduced Phage Productivity 

Host density influences community stability by affecting
the peak phage densities that follow community-wide host 
lysis. With more phage than hosts within a batch-culture 
system, eventually all sensitive bacteria may become 
adsorbed and lysed (10, 29). However, in continuous culture 
there will be decay in free-phage densities due to outflow 
from the growth vessel. Since the rate at which host cells 
are found by free phage is a function of free-phage density 
(1), there is a race between the survival of phage-sensitive 
bacteria and free-phage outflow. The lower the peak phage 
density, the less the bacterial population will be reduced 
in size due to phage adsorption, and the greater the likelihood
that phage adsorption will not reduce the bacterial
population to the point of extinction. The smaller the density 
of bacteria available for infection within a chemostat, 
in turn, the lower the peak phage density. Bohannan 
and Lenski (14) demonstrated this point experimentally by 
reducing the bacterial growth potential through restrictions
in the density of a limiting nutrient (glucose) and then
observing an increase in phage-bacterial community stability.
See figure 5-3C for a simulated chemostat in which
the nutrient density in the reservoir has been reduced by 
one half, and again note that extinction of bacteria and 
phage is avoided. 
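
The race between phage adsorption and phage outflow described above can be sketched with a closed-form estimate. The assumptions here are strong and are made only for illustration: after the peak, no new phage are produced, free phage decline solely by washout, and bacterial growth is ignored. The parameter values are hypothetical.

```python
import math

def surviving_fraction(peak_phage_per_ml, adsorption_const, dilution_rate):
    """Fraction of sensitive bacteria never adsorbed as phage wash out.

    Assumed model: P(t) = P_peak * exp(-w * t) and dN/dt = -k * P(t) * N.
    Integrating from t = 0 to infinity gives N_final / N_0 =
    exp(-k * P_peak / w).
    """
    return math.exp(-adsorption_const * peak_phage_per_ml / dilution_rate)

k, w = 1e-7, 0.2                       # hypothetical: mL/h and 1/h
for peak in (1e7, 5e6):                # full and halved peak phage density
    print(f"peak {peak:.0e}/mL -> survival {surviving_fraction(peak, k, w):.4f}")
```

Because peak phage density enters through an exponent, halving the peak takes the surviving fraction to its square root under these assumptions, a much larger change than a simple doubling.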


The productivity of phage infections and infection
density together determine peak phage densities. It is well
known that phage growth parameters such as lysis timing
and burst size can vary as a function of temperature, host
nutrition status, phage multiplicity, and host physiology
vis-à-vis the standard bacterial growth curve (1-4, 8, 34,
42, 57, 65, 69, 74). If the stability of a chemostat is an inverse
function of peak phage density (i.e., more phage = less stability),
then reduced infection productivity given reduced
nutrient availability should contribute to an increase in
community stability. Similarly, we might expect that the
T-even-phage lysis-inhibition phenotype (1-4) would be destabilizing
since it contributes, particularly at higher host
densities, to a larger phage burst size. In figure 5-3D the
impact on community stability of reducing the phage burst
size by one half is explored, with bacteria and phage extinction
yet again avoided.

Synthesis 

It is highly likely that phage-host community stability
arises from two relatively simple forces: (i) If bacteria
cannot be driven to extinction by even excess phage densities,
for example as is at least approximated with host refuges
from phage attack, then bacteria simply will not be driven to
extinction by phage. (ii) If sensitive hosts can be driven
to extinction given sufficient phage densities, then hosts
will be driven to extinction only if sufficient phage densities
are present within an ecosystem. There are two corollaries
to the second point: (a) At peak phage density, the fewer
phage found within an ecosystem, the smaller the negative
impact those phage will have on phage-susceptible
bacterial populations and therefore the more stable the
system (compare figure 5-3A with figures 5-3C and 5-3D).
(b) Mechanisms that interfere with a phage's attainment
of higher peak densities (or with the phage impact on individual
bacteria), such as greater phage decay, more rapid
outflow, partial inhibition of phage adsorption, or reduced
phage burst sizes, can lead to an increase in the stability of
a phage-bacterial community (figure 5-3). Indeed, as noted
by E. S. Anderson in 1957 (p. 205: 10), “Anything which
restricts the phage titre limits the selective action of phage.”

Phage Ecosystem Ecology 

Phage ecosystem ecology (table 5-1) encompasses the biotic 
as well as the abiotic world, in particular the biogeochemical 
cycling of nutrients and the flow of energy between and 
through ecosystems, usually with an aquatic emphasis 
(for recent reviews see: 38, 56, 59, 68, 69, 78). Bacteria 
consume, produce, and store nutrients and energy, as well 
as contribute to the decomposition of other organisms. 
Phage infections contribute to a solubilization of bacteria,
whether following host-cell lysis or via the conversion of
host components into virion particles. Solubilized bacteria,
in addition to no longer functioning as consumers, producers,
or decomposers, are also less available as food to
bacteria grazers: those protists or animals that obtain their
nutrients and energy through the ingestion or engulfment of
intact bacteria. Since bacteria are the chief consumers of the
solubilized components of aquatic ecosystems, a major
consequence of the phage-induced lysis of bacteria is not
just a reduction in the productivity of bacterial populations
but also a delay in the movement up food chains of
bacteria-contained nutrients and energy. One can additionally
consider the phage-encoding of bacterial toxins (see chapter 47)
from the perspective of ecosystem ecology: toxin exposure 
can free up nutrients from the tissues of bacteria-infected 
eukaryotes (see 20 for a review of bacterial pathology from 
the perspective of phage ecology). 

In their requirement for intact bacteria, phage in a sense 
are competitors of the bacteria grazers. Due to the host-range
constraints observed among all parasites, individual
phage also tend to be more specialized than most grazers in
terms of which bacteria within a community they may affect
(e.g., 19, 71). The greater the densities of phage-susceptible
bacteria (except to the point where bacterial physiology
suffers), the faster phage populations can grow. The more
phage ultimately produced, the lower the density of
phage-susceptible bacteria after a phage attack. Presence of phage
within an ecosystem, particularly phage displaying relatively
narrow host ranges, consequently has the perverse
effect of selecting against common bacterial phenotypes, 
and thereby for increased diversity in the bacterial 
community; that is, for multiple phage-susceptibility types 
(72, 78). Phage additionally can make direct, positive contributions
to the fitness of bacteria hosts through phage
conversion (12) or via the transduction of genes from other 
bacteria (20). Phage DNA and protein coats, following failed 
infection, might even serve as bacterial nutrients (38). 

Phage ecosystem ecology represents an elaboration on 
the various issues of organismal, population, and community
ecology already considered. It follows, therefore, that
many or all of the complications discussed throughout this 
chapter also affect our understanding of the phage impact 
on ecosystem nutrient cycling and energy flow. In addition, 
much of the impact of phage on ecosystems, though still 
incompletely characterized, has been discerned from the 
study of aquatic phage biology. The influence of phage, especially
at microscales, is less easily grasped in more complex
ecosystems such as undisturbed soils (25) or in biofilms 
(43, 73). Indeed, one essential step in characterizing the 
phage impact on soil bacteria—the determination of a 
soil-phage total count (e.g., 13)—has only recently been 
accomplished (11). Thus, both literally and figuratively our 
understanding of the impact of phage on real ecosystems 
has barely scratched the surface of phage ecosystem 
ecology’s ultimate goal: quantifying the phage impact on 
nutrient cycling and energy flow throughout the biosphere. 
Imagine trying to stuff a string more than 6 µm long into a
sphere that is fifty nanometers in diameter. The hole in
the sphere that the string must enter is only twice as wide
as the string itself. The string is stiff, with a persistence
length on the order of 50 nm. It is also negatively charged
and self-repulsive. The string must be organized such that it
can be pulled out easily, so no knots or tangles are permitted.
When the sphere is full, the string will have a near crystalline
density. You have several minutes to complete this task.
This difficult feat is the challenge presented to dsDNA
phages during DNA packaging, a pivotal event in the assembly
cascade.
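
The scale of this task can be checked with a little arithmetic. The sketch below takes the string length and sphere diameter from the description above and adds two assumptions that are not stated there: a B-form rise of 0.34 nm per base pair and a bare duplex diameter of 2 nm. The function name is hypothetical.

```python
import math

STRING_LENGTH_NM = 6000.0    # "more than 6 µm", from the text
SPHERE_DIAMETER_NM = 50.0    # from the text
RISE_PER_BP_NM = 0.34        # assumption: B-form DNA rise per base pair
DNA_DIAMETER_NM = 2.0        # assumption: bare duplex diameter

def packing_summary(length_nm, sphere_d_nm, rise_nm, dna_d_nm):
    """Return base pairs, length/diameter ratio, and volume fraction."""
    base_pairs = length_nm / rise_nm
    # Treat the DNA as a cylinder and the head as a sphere.
    dna_volume = math.pi * (dna_d_nm / 2) ** 2 * length_nm
    sphere_volume = (4.0 / 3.0) * math.pi * (sphere_d_nm / 2) ** 3
    return {
        "base_pairs": base_pairs,
        "length_to_diameter": length_nm / sphere_d_nm,
        "volume_fraction": dna_volume / sphere_volume,
    }

summary = packing_summary(STRING_LENGTH_NM, SPHERE_DIAMETER_NM,
                          RISE_PER_BP_NM, DNA_DIAMETER_NM)
for name, value in summary.items():
    print(f"{name}: {value:.3g}")
```

The occupied fraction scales with the square of the assumed duplex diameter, so allowing for a hydration shell around the DNA raises it considerably above the bare-duplex figure.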

The task of compacting the double-stranded DNA 
chromosome into a protein capsid is a dramatic endeavor. 
DNA by its nature does not want to be in condensed form, 
but rather is dispersed, occupying a volume more than 
100 times its volume inside the virion (47, 54). Therefore, in 
order to be packaged, energy must be invested in the DNA. 
The DNA packaging event must also be coordinated with 
the replication of the phage DNA that is to be packaged, 
as well as the assembly and maturation of the protein capsid. 
Numerous investigators, using a battery of model phage 
systems, have made a concerted effort over four decades to 
resolve the components and mechanism of DNA packaging. 

The specific components and processes
involved in DNA packaging for many of the phages are
described in the accompanying chapters of this book. Our
intention here is to describe the specific challenges of
double-stranded DNA packaging in bacteriophages and
detail the common events and structures involved. For most
of the systems dealt with here, an extensive battery of
biochemical and genetic resources has accumulated over
the past half century. Defined in vitro DNA packaging
systems have been developed for many of the phages we
will describe [T3 (41); T7 (89); T4 (79); λ (50); φ29 (40)].
This ability to manipulate DNA packaging has been the
hardy complement to the genetic, biochemical, and microscopy
approaches that preceded, and now parallel, the
development of these experimental systems. More recently,
structural data have come to the forefront of efforts to understand
DNA packaging in the form of cryo-electron microscopic
reconstruction of phage structures and X-ray
crystallographic and NMR analyses of components of the
DNA packaging machine. These advances bring additional
relevance to the study of DNA packaging in bacteriophages
and offer the opportunity to elucidate the mechanism of
DNA packaging.

Components of DNA Packaging Systems 

In order to provide an informative account of the phage DNA
packaging process, we will first briefly review the components
involved in packaging in some well-characterized
phage systems. All these phages have a double-stranded
DNA to be packaged; a prohead receptacle for the packaged
DNA; and packaging ATPases, enzymes that procure the
DNA substrate and mediate the conversion of chemical
energy to the mechanical energy required to translocate
the DNA into the prohead. The convergence of the maturation
pathways and the interaction of these components
together constitute the DNA packaging event.

DNA 

The phage DNA chromosome must retain the information to
do three things: ensure its own replication to produce chromosomes
to be encapsulated into progeny virions; commandeer
the host cell metabolism and redirect it toward the
production of progeny virions; and encode the structural
proteins and enzymes required to assemble new virions. To
achieve the first goal, a number of strategies yield forms of
replicated DNA that are presented as immature chromosomes
to be packaged. Virion DNA of the dsDNA phages
is linear and is packaged processively, generally from left
to right with respect to the conventional genetic map
(an exception is the T3/T7 systems, which package right to
left). There is a teleonomic relationship between the DNA
replication strategy of a given phage and the form of the
linear DNA encapsulated in the virion. The key to the
relationship lies in the replication of a linear DNA molecule
upon infection without loss of genetic information needed
to prime DNA synthesis at the 5' end. DNA replication strategies
and the resulting structure of the DNA packaging
substrates are summarized in table 6-1.

Table 6-1 Viral DNA and DNA Replication Strategies of Various Double-Stranded DNA Phages

Phage | Incoming virion DNA | Replication strategy | End state of replicated DNA
λ | Linear with 5' 12-base complementary ends (cohesive ends) | Closed circle switching to rolling circle | Linear concatemer
P22 | Linear with 104% terminal redundancy | Recombination with extension via direct repeats | Linear concatemer
φ29 | Linear with covalently attached gp3 at 5' ends | gp3-primed extension, strand displacement | Unit length with gp3 covalently attached at 5' ends
T3 (T7) | Linear with 230 (160) base pair direct repeats | Recombination with extension via direct repeats | Linear concatemer
T4 | Linear with 102% terminal redundancy | Invasive strand initiated via terminal redundancy | Branched concatemer
SPP1 | Linear with 104% terminal redundancy | Unknown | Linear concatemer

An accessible form of DNA is the defined unit length
chromosome produced by φ29. Attached to each 5' end of
the 19 kb φ29 dsDNA is a covalently linked terminal protein,
gene product 3 (gp3) (68). This DNA-terminal protein
complex, which is analogous to the DNA-terminal protein
complex in adenovirus (80), is capable of priming DNA replication
from each end, thus providing a straightforward
means of overcoming the loss of information in lagging
strand synthesis (69). The result is a mature, unit length
chromosome that can act as a ready substrate for DNA
packaging (6). Similar to φ29, unit length DNA is produced
during replication of the phage P2 genome. Unlike φ29,
however, P2 DNA is replicated via a closed circle mechanism
similar to plasmid replication (4). In P2 the covalently
closed, circular DNA (77) is processed to a linear form for
packaging (11, 12). Since this linear molecule must have
the capacity to recircularize upon entry into the host cell,
the packaging apparatus generates 19-base 5' overhangs
that mediate circularization. DNA replication is not so
simple in other phages, however, and the substrate chromosome
for DNA packaging rarely appears in such an
accessible form.

DNA replication during infection by many well-studied
dsDNA phages produces a substrate DNA for packaging
that is a composite of individual genome lengths organized
into head-to-tail concatemers. λ circularizes its infecting
DNA molecule via the 12-base sticky ends. Unlike in P2,
initial closed circle replication employing a single origin on
the DNA is displaced by a rolling circle mechanism that
produces DNA concatemers several genomes in length.
Thus, to recapitulate the linear chromosome, DNA packaging
resolves single copies of the chromosome with the
5' overhangs from the double-stranded concatemer (see
below).

The linear virion DNA of many other dsDNA phage types
is, unlike φ29, P2, and λ, longer than the length of the
genome. In phage Mu, which integrates its DNA into the
host cell genome, the additional DNA is of host origin,
the result of excision of a length of DNA greater than the
length of the integrated phage genome (43). In phages T3
and T7, P22, SPP1, and T4, the linear virion DNA is terminally
redundant, with a portion of the DNA sequence at
one end of the genome being repeated at the other end of 
the DNA. This terminal redundancy permits replication 
without the loss of genetic information in that, although 
linear replication causes loss of information at the 5' ends, 
the redundancy allows the entire sequence to be recovered 
during subsequent replication (55). These phages employ a 
variety of mechanisms to generate long concatemers that 
depend upon this terminal redundancy of the chromosome, 
which in turn yield terminally redundant genomes during 
DNA packaging. In some cases (T7, T3), the sequence that 
makes up the terminal repeat is the same for all virions in 
the population. In others (T4, P22), the packaging process 
yields a population of packaged genomes that are circularly 
permuted with respect to each other and therefore have 
different terminally redundant sequences in different 
particles. 
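
The link between terminal redundancy and circular permutation can be illustrated with a toy model. The sketch below assumes that successive chromosomes are cut sequentially from a head-to-tail concatemer in fixed lengths slightly longer than one genome (104% in this example). The 25-letter "genome" and the function name are hypothetical, and real cleavage is less exact than this.

```python
def headful_packages(genome, headful_len, n_heads):
    """Cut successive fixed-length headfuls from a concatemer of `genome`."""
    # Build a head-to-tail concatemer long enough for every requested head.
    copies = (headful_len * n_heads) // len(genome) + 1
    concatemer = genome * copies
    return [concatemer[i * headful_len:(i + 1) * headful_len]
            for i in range(n_heads)]

genome = "ABCDEFGHIJKLMNOPQRSTUVWXY"    # 25 hypothetical "genes"
headful = 26                             # 26/25 = 104% of one genome length
for dna in headful_packages(genome, headful, 3):
    # Each chromosome starts one gene later and repeats its first gene at the end.
    print(dna)
```

Each packaged molecule carries every gene, repeats its first gene at its far end, and begins one gene further along than its predecessor, which is the permuted, terminally redundant population described above.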

While the exact mechanism of replication to form linear
concatemers for phages λ, P22, T3, and T7 varies (55), the
end result is a packaging substrate consisting of a long molecule
comprised of multiple copies of the genome from which
a virion’s complement of DNA is procured during packaging.
An additional complication is faced by phage T4, whose
invasive strand replication initiation yields not only long
concatemers, but ones containing numerous Holliday junctions
that leave them highly branched (23). These convolutions
must be resolved during packaging to yield the
appropriate linear DNA to be translocated into the phage
head. Phage Mu, whose DNA is integrated into the host
genome, must excise copies of its genome from the host
chromosome prior to, or concomitant with, DNA translocation
(43). In each of these systems a complex series of
enzymes and processes effect maturation of the DNA
substrate and mediate its encapsidation.

Packaging Enzymes 

The task of retrieving the phage DNA and processing it to a
packagable form rests with a collection of proteins forming a
complex often referred to as the terminase holoenzyme (16).
This term reflects the primary function of these enzyme
complexes in many phage systems, where they perform the
task of retrieving the unit length DNA packaging substrate
from the long concatemers formed by the myriad DNA
replication strategies. This definition underrepresents the
true capacities of this group of proteins since it describes
only one of multiple functions during packaging. In addition
to cleaving the substrate DNA to terminate packaging and
generate a new end, terminase complexes target the DNA to
the waiting prohead and mediate ATP hydrolysis to power
DNA translocation, possibly acting as the primary transducer
of force during translocation (see below). The designation
terminase does not apply to phages which package
a preformed unit length genome, such as φ29, where the
enzyme is more appropriately termed the packaging ATPase.


All known terminase holoenzyme (packaging ATPase)
complexes function as a complex of two proteins. The classical
terminase combination consists of a large and a small
protein, each with specific activities. The small subunit
recognizes and binds to specific sequences in the substrate
DNA in most phage systems and positions the large terminase
subunit to cleave the DNA. Endonuclease activity
invariably lies in the large subunit, as well as the ATPase
activity responsible for mediating DNA translocation (42,
70). For most systems the DNA-bound large subunit interacts
with the prohead, and hydrolyzes ATP to power DNA
translocation.

Proheads 

DNA packaging culminates in the insertion of the mature
DNA substrate procured by the terminase holoenzyme into
a receptive prohead (figure 6-1). The icosahedral head shells
of the dsDNA phages share the same basic architecture and
maturation pathway (45). The major shell protein polymerizes
to form the prohead shell by associating with
the head-tail connector and scaffold-core components. The
dodecameric connector (or portal protein) is embedded at
one of the 12 vertices, termed the portal vertex (93). Nomenclature
diverges in different phage systems, with the
core-containing structure being termed prohead (λ, φ29),
prehead (T4), or procapsid (P22). For the sake of consistency,
we refer to this precursor capsid shell as the prohead.

Figure 6-1 Schematic of generalized dsDNA phage assembly. A prohead interacts with the packaging ATPase
holoenzyme-DNA complex via its head-tail connector. ATP hydrolysis powers translocation of the mature DNA, and at some
point the scaffold core is ejected, either whole or following proteolysis. After an amount of DNA enters the head, the shell
capsomeres rearrange, making the head more angular and, in most phages, increasing the head volume. DNA translocation
continues until a full complement of DNA enters the head, determined either by the unit length of the DNA, sequence
recognition of the DNA length, or a headful mechanism. The ATPase-DNA complex detaches from the connector and is
replaced by neck, tail, and/or tail fiber components, yielding a mature, infectious virion.

There is a common maturation pathway of the prohead 
for most dsDNA phages that varies only in detail (figure 6-1). 
Putative prohead structures can be isolated from mutants 
lacking DNA packaging components, and they may or may 
not contain scaffolding protein. Prohead-like particles may 
be defective for packaging because they are unstable or 
immature (27), reflecting a need for synchrony of prohead 
maturation and packaging. For example, expansion of 
the capsid and concomitant thinning of its wall that is 
programmed to occur in DNA packaging may have initiated 
or occurred prematurely. The role of these structural 
maturation events has been probed in detail with respect to 
their mechanistic and temporal relationship with DNA 
packaging initiation and DNA translocation (see below). 
In general, the viable receptacle for DNA packaging is the 
unexpanded, core-containing prohead, with any proteolytic 
maturation and shell expansion occurring after DNA 
packaging initiation. 

Occupying a unique vertex of the prohead is a multifunctional
structure called the head-tail connector or portal that
is essential in prohead assembly and DNA packaging (96).
The distinction between these two terms lies in the role
this structure plays at different times in assembly. The term
portal refers to its role in facilitating the passage of DNA into
and out of the prohead, whereas the term connector refers to
its role as the junction between prohead and tail. We favor
the term connector, simply due to its preference in the
systems with which we are most familiar (φ29, T4). Prior to
DNA packaging, the connector plays a role in the initiation of
head shell formation by interacting with both the scaffold
and head shell proteins. In phage T4, the gp20 connector
also interacts with gp40 on the inner surface of the cell
membrane to initiate head formation (101). During packaging,
the connector binds the mature DNA packaging
ATPase complex, is the portal for entry of the DNA (possibly
playing an active role in translocation), and is involved in the
signaling for packaging termination. Following the completion
of packaging the connector is the target for tail assembly,
and in the mature virion it has a role in release of DNA
during infection. That the connector is capable of engaging
in each of these processes in a precise order speaks to its
remarkable capacity not only to do many things, but to do
them at the right time.

DNA Packaging Processes 

In the infected cell viral DNA is recognized by the packaging 
proteins in a background of host polynucleotides. In spite of 
differences in the mechanics of DNA replication in different
phages, as well as the persistence or absence of an intact host
genome, there may be a single mechanism for phage DNA 
maturation for packaging that is grounded in DNA end 
formation. DNA maturation for packaging is defined as 
targeting of the resolved phage chromosome to the waiting 
prohead. The dsDNA phages select genomic DNA efficiently 
from the myriad pool of nucleic acids within the cell as 
evidenced by the high efficiency of infection by progeny, 
nearing 100% for most dsDNA phages. 

Does the DNA packaging enzyme complex pre-assemble 
and then target the mature prohead, or does it assemble on 
the prohead? Is this prohead targeting event correlated with 
prohead maturation events or with DNA replication or transcription?
These points may be crucial in that individual
events can be temporally or, more importantly, mechanistically
coupled to one another.

Once the prohead and DNA are linked and the DNA is 
positioned for packaging, how is DNA translocated? The 
structure and mechanism of the motor and the nature of 
the chemomechanical energy conversion are the areas of 
greatest current interest and experimental focus. After the 
complement of DNA has entered the prohead, packaging is 
terminated. The unit length of the DNA packaged can be 
measured by targeting a DNA sequence to signal that the
head contains one genome, or the amount of DNA in the
head may feed back on the packaging machine to trigger
termination. By compiling what is known for each phage,
our intent is to describe a general DNA packaging mechanism
for all dsDNA phages. However, a universal mechanism
for DNA translocation might not exist, and caution
is needed in comparing individual facts from disparate
systems.

DNA Maturation 

Maturation of the phage DNA from the cytoplasm of the 
infected cell is the first event of packaging (figure 6-2). As 
mentioned, phages such as φ29 and P2 replicate unit
length chromosomes. Phages λ, P22, SPP1, T3 and T7, and
T4 produce long concatemers of DNA comprised of a 
number of copies of the genome linked head to tail; unit 
length chromosomes are cut from these concatemers and 
packaged. 

DNA maturation events in phage λ are quite well understood
(17). End formation occurs at the structurally complex
cos site, which spans 200 bp of DNA at the ends of the
genome. The terminase holoenzyme complex of the two λ
packaging proteins, gpNu1 and gpA, binds cos through the
interaction of Nu1 with three sequence domains, R1 through
R3, of cosB (“binding”) on the right of the cos site. The larger
subunit, gpA, then catalyzes a single-stranded nicking
reaction in the central cosN region (“nicking”), in the center
of the cos site, producing 12-base 5' overhangs. Subunit gpA is
thought to bind as a dimer, thus permitting cutting of both
strands on one side of the DNA helix to generate the 12-base
overhang. Although gpA alone can bind and cut λ DNA in
vitro, apparently Nu1 is crucial for efficient targeting in
vivo. Terminase binding and cleavage initiation in λ also
involves the action of IHF (integration host factor), which
binds the region in the cos site between R1 and R2. IHF
bends the DNA in such a way that a dimer of Nu1 can bind
the juxtaposed R1 and R2 sites (24). Once DNA is cleaved by
terminase, strand separation occurs via an ATP-dependent
process (42), possibly to rearrange and activate the terminase
subunits bound to the DNA for an additional, as yet
unidentified maturation process. Terminase preference for
the right side of the cos site is driven by gpNu1 binding to
cosB to form the stable intermediate, complex I. The union
of complex I with the λ prohead yields the ternary complex
II, which then proceeds to translocate the DNA through the
connector and into the capsid.

Figure 6-2 Schematic of selected strategies for DNA maturation and packaging. Maturation ranges from formation of defined
unit length chromosomes in λ, φ29 and PI, to terminally redundant SPP1 and T4. Mu retrieves DNA from the host genome
upon excision. Structure of the pac site and accuracy/location of cleavage (arrows) with respect to pac varies among phages.
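
The way two staggered single-strand nicks yield complementary 5' overhangs, as described for cosN above, can be sketched with plain string handling. The sequence used here is an arbitrary, hypothetical stand-in and is not the λ cos sequence. The function names and nick position are likewise illustrative assumptions.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of a DNA string written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]

def staggered_cut(top, nick_top, stagger=12):
    """Return the two 5' overhangs left by nicks `stagger` bases apart.

    `top` is the top strand written 5'->3'. The top strand is nicked at
    index `nick_top`; the bottom strand is nicked `stagger` bases to the
    right. Once the strands separate, the left fragment keeps a bottom-strand
    5' overhang and the right fragment keeps a top-strand 5' overhang.
    """
    segment = top[nick_top:nick_top + stagger]
    right_overhang = segment                      # top strand, 5'->3'
    left_overhang = reverse_complement(segment)   # bottom strand, 5'->3'
    return left_overhang, right_overhang

dna = "TTACGGATCCGATTACAGGCTTAACGGTCA"   # hypothetical sequence
left, right = staggered_cut(dna, nick_top=9)
print("left-fragment 5' overhang: ", left)
print("right-fragment 5' overhang:", right)
```

Because each overhang is the reverse complement of the other, the two ends can re-anneal, which is the property that lets a linear chromosome with cohesive ends circularize.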

Termination of λ DNA packaging is achieved by sequence
recognition of the downstream cos site by the engaged
packaging apparatus during DNA translocation. The
cosQ region on the left of the downstream cos complex is
recognized, and a cosB-independent cleavage occurs at
cosN to generate a complementary 5' overhang on the
end of the packaged chromosome (100). cos cleavage is
sequence-specific but also involves detection of the amount
of DNA that has been packaged, since constructs with less
than 78% of the normal complement of DNA between cos
sites fail to cut normally (31). This feedback mechanism
plays a role in other phages (see below). The termination of
each packaging cycle from a concatemer regenerates the
complex I, which can initiate a new translocation event.
On average, each termination-initiation cleavage event is
capable of priming between two and three translocation
events without the need to generate complex I from the
DNA concatemer de novo (32).
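
The combination of sequence recognition and a DNA-length check described above can be caricatured in a few lines. In the sketch below, a downstream cos site terminates packaging only if at least 78% of the normal DNA complement has been packaged since the previous cut. Treating undersized intervals as simply skipped is an assumption made for illustration, since the text says only that such constructs fail to cut normally. The function name and the cos positions are hypothetical, and 48,500 bp is used as an approximate genome length.

```python
def package_from_concatemer(cos_positions, normal_length, min_fraction=0.78):
    """Return (start, end) pairs for successively packaged chromosomes."""
    sites = sorted(cos_positions)
    packages = []
    start = sites[0]                  # initiation cut at the first cos site
    for site in sites[1:]:
        # Assumed rule: cut only if enough DNA has entered the head.
        if site - start >= min_fraction * normal_length:
            packages.append((start, site))
            start = site              # the same cut initiates the next round
    return packages

# Hypothetical concatemer with one cos site placed too close to the first
cuts = package_from_concatemer([0, 30000, 48500, 97000], normal_length=48500)
print(cuts)
```

Reusing each termination cut as the next initiation point mirrors the regeneration of complex I at the end of each packaging cycle.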

The other phages requiring a cleavage of their DNA to
terminate packaging and to generate a free end for the next
cycle display variations of the λ archetype. The terminase
complex recognizes and binds a defined pac site on the DNA
and catalyzes cleavage to generate an end for packaging
compatible with requirements for DNA replication on
infection. The circular unit length chromosome of phage P2
is cleaved by terminase at a pac site to generate the linear
DNA to be packaged (11, 12). Like that of λ, the P2 terminase
endonuclease generates 5' single-stranded ends, 19 bases long. P2
DNA cleavage is coupled to prohead docking in that viable
proheads must be present in vitro in order for the terminase
complex to target and cut the DNA (12, 77). Phages such as
T3 and T7, P22, SPP1, and Mu employ a strategy similar to
that of λ in that a holoenzyme complex of the two terminase
proteins targets a pac site. Differences are found in the precision,
i.e., the location, of cuts made in the precursor DNA
relative to the pac site. As in λ, DNA processing accommodates
the requirements for DNA replication upon infection.
Phages T3 and T7 are proposed to make staggered, defined
single-stranded cuts at the pac site, but these nicks are 230
and 160 bp apart, respectively. It was originally proposed
that DNA synthesis separates the strands between these
nicks and regenerates double-stranded DNA with blunt
ends from the single-stranded ends, ensuring the terminal
redundancy needed for genetic competence of the progeny
virion (97). More recently, a double-strand break mechanism
has been proposed (34). The terminal redundancy is preserved
at the right end via a nicking and replication mechanism
in which a displaced template is produced, followed by
a double-stranded cut that retrieves the right end of the
packaging genome from the concatemer. Packaging terminates
with a double-stranded cut at the left end. An analogue
of the involvement of IHF in λ packaging appears to be
the bending of SPP1 DNA mediated by the gp1 small terminase
subunit (19).

Phages P22 and Mu maintain a pac site that is targeted
by their respective terminases, but cleave the DNA
nonspecifically (Mu) or semispecifically (P22) in a region of
the adjoining DNA. In the case of Mu, whose unit length 
genome is integrated into the host cell chromosome, the 
initiation cut is upstream of the phage pac site (35). Therefore 
Mu retrieves a small portion of the host chromosome 
DNA, on the order of 56-144 bp, on the left end of the phage 
DNA to be packaged. P22 makes an initiation cleavage 
within a target area of approximately 120 bp of its pac site, 
generating a blunt end DNA capable of packaging initiation 
(3,14). 

In addition to the relative lack of fidelity in initiation 
cleavage, Mu and P22 do not terminate packaging at a 
predetermined sequence as in λ or T3 and T7. Rather, these
two phages engage in a headful packaging mechanism in 
which the sequence-independent cleavage of the DNA is 
determined by the amount of DNA packaged. Packaging of 
more than one genome length of DNA ensures replication 
competence upon infection. In Mu, host sequences to the 
right of the phage genome are packaged (20). In P22, 104%
of the genome is packaged, providing the terminal redundancy
needed for DNA replication (13). How termination
cleavage is triggered is unknown, but the physical force 
of the compacted DNA against some component of the
packaging machine, either the connector or the ATPases,
may signal that the head is full. Work on P22 and SPP1,
which use headful packaging, has demonstrated that mutations
in the connector affect the length of DNA packaged (15,
94). These mutants suggest that headful packaging control
lies in the connector but leave open the possibility that the
terminase complex engaged in packaging could be altered
indirectly by these mutations. In SPP1 and P22, as in λ, the
termination of the initial DNA packaging event regenerates
the initiation complex that can target the next available
prohead. Unlike λ, however, the left ends of the DNA generated
from subsequent rounds of packaging are staggered
downstream in increments of 4% (in P22) or 5.6% (in SPP1)
of the genome length as a result of the headful mechanism 
described above. 
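
The staggering of left ends under headful packaging follows from simple arithmetic, sketched below in Python. This is an idealized illustration, not a model from the literature: it assumes every headful in a series is exactly the same length and that the first cut falls at position 0. The function name is hypothetical, and the 105.6% headful used for SPP1 is inferred here from the 5.6% stagger quoted above rather than stated in the text.

```python
from fractions import Fraction


def headful_left_ends(headful, rounds):
    """Left-end positions of successive headfuls cut from a concatemer.

    headful: DNA packaged per head, as a fraction of one genome length
             (for example Fraction(104, 100) for P22).
    rounds:  number of sequential packaging events in the series.

    Positions are reduced modulo one genome length, so each value is the
    offset of that left end from the position of the first cut.
    """
    return [(i * headful) % 1 for i in range(rounds)]


# P22: 104% headful, as given in the text.
p22 = headful_left_ends(Fraction(104, 100), 4)

# SPP1: headful size assumed to be 105.6%, inferred from the 5.6% stagger.
spp1 = headful_left_ends(Fraction(1056, 1000), 4)

print([float(x) for x in p22])
print([float(x) for x in spp1])
```

Exact fractions are used only to keep the modular arithmetic free of floating-point noise; the same offsets also equal the terminal redundancy accumulated per round.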

Phage T4 is unique in comparison with other well-studied
dsDNA phages in that it does not have a defined,
sequence-specific pac site. The holoenzyme complex of the
large T4 terminase protein, gp17, and the small terminase
protein, gp16, binds the hydroxymethylated T4 DNA via
gp16 targeting. While no particular sequence is recognized,
initiation cleavage was recently shown to be coupled to
recognition of single-stranded regions generated by replication
initiation and transcription (33). gp17 has a domain for
binding to single-stranded DNA, and binding is augmented
by the smaller gp16 terminase subunit. Therefore, while
not sequence-specific per se, this requirement for
single-stranded DNA suggests that initiation cleavage in T4 is not
entirely a random event since it is coupled to
sequence-specific processes. The large terminase subunit, gp17,
interacts directly with the connector, gp20 (62), and the
ATPase activity of gp17 probably powers both DNA cutting and
translocation. Recently it has been shown that T4 gp17 also
interacts with the phage late sigma factor gp55 (67),
implying that the gp17 terminase subunit targeting of the
DNA is in part directed by a cofactor similar to those seen
in λ and SPP1 (see above).

As mentioned, T4 DNA packaging must also resolve a 
large number of branches in the substrate DNA in order 
to produce an intact linear DNA genome. Endonuclease 
VII, gp49, is responsible for much of this work, although 
gp17 alone might be capable of resolving most branches
since some filled heads are produced in the absence of
gp49 (65). The gp49 works as a resolvase capable of trimming
the branched DNA at the Holliday junctions left
over from invasive strand-initiated replication (71). Originally
it was thought that gp49 acted downstream of
the packaging complex on the prohead, but more recently
it has been shown that this enzyme is in contact with the
gp20 connector during packaging (36). The gp49 binds
a discrete domain of the connector and is not in contact
with gp17.

T4 DNA packaging is terminated by a headful packaging
mechanism, with cutting being mediated by gp17. Isometric,
petite heads package DNA to the same density as larger
prolate heads, yielding a virion chromosome 40% smaller
than normal (30). In addition, canavanine- or head shell
mutation-induced lollipop monster phages are capable of 
packaging large DNA molecules in the megabase range (22). 

The unit length φ29 DNA with its associated gp3 terminal
protein is competent for DNA packaging after replication,
without the need for cutting seen in other phages.
However, a complex series of steps generates a DNA substrate
capable of interacting with the prohead. Sequence-independent
interaction of the terminal protein, gp3, with
the downstream DNA helix has been described (37). This
produces lariat loops that can form without the participation
of other phage proteins. Only one gp3 is required, since
lariats can form from various lengths of either right- or
left-end fragments of the DNA generated by restriction
endonuclease digestion. Binding of the gp16 packaging ATPase to
gp3 at the lariat loop junction permits the introduction of
supercoils into the lariat. Whether this supercoiling event is
ATP-dependent is unclear, and the mechanism by which the
DNA is twisted has not been resolved.

This higher order conformation of the DNA-gp3-gp16 complex
seen in φ29 likely provides efficient packaging initiation.
It has been demonstrated that free φ29 connectors wrap
about 1.6 turns of DNA around their outside surface (95),
and gel electrophoresis and electron microscopy show that
the connector embedded in the intact prohead also binds
and wraps supercoiled DNA (D. Anderson, unpublished
data). This event provides a mechanism by which the end of
the DNA is targeted to the connector portal to initiate
packaging. Whereas the end of the mature chromosome
is a relatively small target in the busy environment
of the cell, the long axis of the DNA is a large target for
any complex capable of recognizing and interacting with
the DNA. We suggest that φ29 takes advantage of this
by targeting the long axis of the DNA and then moving
in one dimension to the end to initiate translocation.
Whether nicked DNA, being torsionally unconstrained, 
can initiate packaging has not been clearly resolved. It is 
also unknown whether such DNA tertiary structure is 
employed by other dsDNA phages as a means of targeting 
the DNA packaging enzyme complex to the prohead. 

A second possible function for the wrapping of supercoiled
DNA around the connector is to effect a connector
conformational change that converts it from a static
organizer of shell assembly to a dynamic packaging-motor
organelle. This idea is based on the finding that wrapping of
supercoiled plasmid DNA around the φ29 prohead-embedded
connector allows the shell, previously tightly
fixed to the connector, to be easily stripped away. The 
contour length of the DNA-connector complex that remains 
is reduced by about 120 bp, suggesting that DNA wraps 
around the connector and restrains a negative supercoil, as 
demonstrated previously for the free connector-supercoiled 
DNA complex (C. Peterson and D. Anderson, unpublished 
data; 95). 


Prohead Maturation 

A common theme in dsDNA phage head assembly is the 
maturation of the prohead from a fragile, scaffold/core- 
containing precursor to a stable mature form. The head 
shell likely polymerizes around a scaffold-connector 
complex in a relatively unstable form, which later transforms
to a rigid and structurally robust capsid (27). This
structural conversion can involve proteolytic cleavage of
the scaffold and/or shell proteins and an increase in the
prohead volume by as much as 100% (HK97) (21), a process
called expansion. Scaffold exit may precede or occur concurrently
with expansion.

While both scaffold exit and capsid expansion (with the 
exception of φ29, which does not expand) must occur to
allow the full complement of DNA to enter the prohead, the 
question persists of whether they contribute directly to DNA 
translocation (see below). Considering scaffold processing, 
early models of DNA translocation focused on the 
compacted phage chromosome as an analogue of the 
condensed DNA produced by polyvalent cation-mediated 
collapse of DNA into a toroid (59). In some phages, such as 
T4, it was suggested that core cleavage might produce 
small, charged peptides capable of condensing the phage 
chromosome within the head, thus drawing the long linear 
DNA through the portal vertex (59). Studies on phages T7 
and P22 suggest that the core exit may be coupled to DNA 
packaging, since only core-containing particles are packaged
in vitro (57, 81). In vivo observation of T4 offered additional
support in that only core-containing particles could
be chased into phage during both wild-type and mutant
infection (5,60,64). On the contrary, λ proheads can package
DNA after core exit (46), and φ29 proheads containing only
about five copies of the normal complement of about 150
copies of scaffolding protein are packaged efficiently in vitro
(D. Anderson, unpublished data). These conflicting observations
have not been reconciled, and there is no current
mechanistic model relating scaffold exit and DNA packaging. 

Is the structural transformation of the lattice in prohead 
expansion mechanistically linked to DNA translocation? 
Only unexpanded proheads of λ, T7, and P22 can package
DNA in vitro, and expansion occurs during packaging (27,
86). This led to the suggestion that prohead expansion
might drive DNA translocation. Decrease in ion permeability
of the head shell during expansion in T7 prompted
the hypothesis that the DNA might be sucked into the 
sealed prohead by the hydrostatic pressure created during 
expansion (83). 

However, though increase in head shell volume is 
dramatic, for example, on the order of 50% in T4, this 
increase is insufficient to account for the amount of DNA 
translocated into proheads exhibiting capsid expansion 
(27). DNA translocation is ATP-dependent for all phages, 
and no link between prohead expansion and ATP hydrolysis 
has been described other than the synchrony of packaging 
and expansion. In addition, φ29, which exhibits capsomere
structural change in packaging, does not show a detectable
increase in shell volume (93). In phage T3, which initiates
packaging into unexpanded proheads, expansion occurs
discretely after a small portion of the DNA complement is
translocated (86). Similarly, the capsomere change in φ29
probably occurs after only a few hundred base pairs of the
DNA enter the prohead (7). In T3, packaging can be stopped
in vitro after capsid expansion by the addition of ATP
analogs such as γ-S-ATP, and then restarted and completed
into the expanded prohead upon restoration of ATP (86). The
hydrostatic pump model has been revised such that a regenerated
hydrostatic pressure is maintained across the prohead
shell which pulls the DNA into the capsid (85). This model 
still suffers, however, from the observations that effective 
translocation can occur after initial prohead expansion (79, 
86) and that expansion is most likely irreversible (91). 

The singular test case that unlinks expansion and DNA 
packaging is the report of in vitro packaging of a phage T4 
particle after both core cleavage and expansion (79). 
However, T4 DNA packaging in vivo cannot occur after 
scaffold cleavage and shell expansion (63). Temperature 
shift experiments with temperature-sensitive and cold- 
sensitive mutants in the T4 terminases show that the 
expanded prohead is not rescued in vivo. The only proheads 
that have not initiated packaging that can be rescued in 
similar experiments are scaffold-containing, unexpanded 
proheads (53), implying that after scaffold cleavage, the 
prohead proceeds down a defective pathway in vivo. To 
credit the in vitro experiments requires the assumption that 
expanded proheads can be rescued only with transfer into 
the in vitro world. Additionally, expanded T4 proheads 
assemble tails in vivo without packaging DNA during infection
with terminase mutants (52). This suggests strongly that
expansion prior to DNA packaging is aberrant. It has been
shown recently that the normal in vivo substrate for DNA
packaging initiation in T4 is the unexpanded prohead, and
that prohead expansion occurs after a significant amount
of DNA enters the prohead, as in other phages (51). The
newly reported high-efficiency in vitro T4 packaging protocol
(67) can be used to retest the activity of the expanded
prohead. 

If DNA packaging is not mechanistically coupled to 
head maturation events, what is the impetus for scaffold 
exit and prohead expansion in DNA translocation? The 
answer may lie in subtle events of packaging initiation and 
early DNA translocation that involve connector conformational
change and, consequently, irreversible conformational
change in the shell. Ordered virion assembly is
ensured by binding the DNA packaging apparatus to an
unfilled prohead, but not a filled particle. Thus, DNA packaging
is coupled temporally to core cleavage and prohead
expansion so that packaging precedes tail attachment. 

How DNA packaging triggers prohead expansion is 
unknown. Expansion of the T4 polyhead lattice in vitro
under conditions of low salt is unidirectional and exothermic
(90, 91). It is hypothesized that enough DNA to
form a single layer within the prohead, contacting the
inside of the head shell, might trigger expansion. However,
apparently much less packaged DNA, on the order of a
few hundred base pairs, is thought to trigger capsomere
conformational change in φ29, albeit without expansion.
Expansion might also play a role in ensuring proper organization
of the packaged DNA by opening new binding sites on
the inner surface of the capsomeres. Possibly changes in 
the connector may propagate a wave of expansion up the 
head. Connector conformational changes likely mediate 
packaging initiation, establish the sensing mechanism for 
headful packaging in some phages, and cue ordered tail 
attachment. A certain consequence of shell expansion is 
to provide stability to both the nascent and the mature 
virion. 

The Mechanism of DNA Translocation 

Union of the mature substrate DNA and the ATPase (terminase)
holoenzyme complex with the viable prohead results
in the assembly of a molecular machine that is capable of 
translocating DNA into the prohead. After decades of effort, 
the exact mechanism of DNA translocation is unknown. 
Key in the search for the mechanism of DNA translocation 
is that ATP hydrolysis is the driving force behind DNA 
translocation in all in vitro systems. Moreover, all identified 
DNA packaging holoenzyme complexes have the capacity 
to hydrolyze ATP. Therefore, the task is to define where, 
when and how ATP hydrolysis is used to move DNA around 
or through the connector and into the head. 

First, and most poorly understood, is the mechanism by 
which the free end of the double-stranded DNA substrate, 
once engaged with the head-tail connector, is introduced 
into the portal pore. While DNA deposition may involve a 
continuous transfer from the engaged ATPase complex 
through the connector pore, this delicate initiation must 
depend on the highly evolved fidelity of the terminase 
holoenzyme and connector. 

The mechanism of DNA translocation relates broadly to 
how molecular machines in general harness biochemical 
events to achieve movement of molecules. Included in the 
list of well-studied motors are the myosin ratchet, the
F1-ATPase rotary motor, and the RNA and DNA polymerases.
The same questions that persist in these motors apply to the 
DNA translocation motor: Is there a bias to trap favorable 
Brownian motions by a sequence of small free energy drops 
(power stroke), or is a run of favorable thermal fluctuations 
rectified by a large free energy drop (Brownian ratchet) (75)? 

At the center of past efforts in describing the mechanism 
of DNA translocation lie the ATPases themselves. The 
simplest, and most appealing, mechanism is one in which 
the terminase holoenzyme plays a direct role in translocation.
It has been suggested that the oligomeric ATPase
holoenzyme associated with the connector during DNA
packaging might be able to oscillate up and down with
respect to the axis of the DNA entering the connector (34).
In a mechanism similar to the myosin head ratchet, the
ATPase subunits would “walk,” either individually or in concert,
unidirectionally along the DNA helix, and the DNA
would be translocated in the process (figure 6-3). The precise
nature of the structural transitions in the ATPase complex
that are required to fulfill this mechanism has yet to be
described.

Figure 6-3 An ATPase ratchet model of DNA translocation in T3. A: gp19 ATPase subunit positioned on the connector
interacts with the phosphate backbone of the DNA. B: Upon hydrolysis of ATP, the gp19 monomer changes conformation,
driving the DNA through the connector and into the prohead. C: This translocation event brings the backbone into alignment
with the next consecutive ATPase monomer. D: The next translocation event is initiated. Reproduced from Fujisawa and
Morita (34) with permission.

Similarly, the ATPase holoenzyme complex could directly 
translocate the DNA into the prohead via a mechanism similar
to polymerase tracking along the DNA. If the ATPase
complex is fixed at the connector portal and creeps along
the DNA like a polymerase, the result would be translocation.
This polymerase-creeping model and the similar
ATPase ratchet model draw support from the inhibition 
of packaging by DNA intercalating compounds and the 
detected ability to package DNA with gaps or nicks. This 
implies that the exterior phosphate backbone of the DNA is 
the point of interaction between the translocating motor and 
the substrate (34). Whether the ATPase holoenzyme plays a 
direct role in energy transduction or not, the hydrolysis of 
ATP catalyzed by the terminase complex plays a defining 
role, as evidenced by the effect of certain mutations in the 
ATPase region of the λ gpA protein on the rate and efficiency
of translocation in vivo (26). 

Many current models of DNA packaging hypothesize a 
role for the symmetry mismatch between the dodecameric
connector and the 5-fold symmetric vertex of the icosahedral
shell in which the connector is embedded (figure 6-4a).
This symmetry mismatch potentiates rotation of the connector
within the prohead shell (44) by abrogating the rigid
interaction of components of like symmetry. The symmetry
mismatch of the connector and capsid is confirmed by
cryo-electron microscopy three-dimensional reconstruction of
φ29, which also shows that the connector appears to fit
loosely in the shell (93). Models have been put forward in 
which connector rotation either actively drives packaging 
or passively facilitates packaging. 

The original connector rotation model of packaging has 
the helical DNA being driven into the capsid, with either 
active or passive rotation of the connector, as a bolt passes 
through a rotating nut (figure 6-4b) (44). It is not clear how 
an active screw model would satisfy certain requirements of 
this mechanism, such as the need to axially restrain the 
DNA to prevent it from being rotated by the connector. 
Later, the observation that the φ29 connector could wrap
supercoiled DNA prompted a model in which a rotating
connector would move the externally wrapped DNA relative
to the head shell (95). This physical displacement could be
harnessed in a number of ways to produce DNA translocation,
including direct translocation into the prohead similar
to a ship’s capstan. Alternatively, twisting of the DNA
causing the introduction of supercoils into the DNA by the
rotating connector or by a packaging ATPase activity (10)
could put strain on the DNA so that it enters the prohead
to eliminate this superhelical stress. As yet there is no idea
of how movement between the connector and shell is
mediated in the active connector rotation models, and
connector rotation has not been detected in any phage
system. The recent atomic structure of the φ29 connector
has generated a model of the packaging mechanism that
combines the ratchet and rotation models (see below).

Figure 6-4 Symmetry mismatch between the prohead and
connector potentiates connector rotation and DNA translocation.
A: Due to the 5-6 mismatch, only a single vertex of
each component is aligned at any one time, thus facilitating
rotation. If the contact point becomes a lever for translation
of the connector with respect to the prohead (*), rotation
can be driven by successive events around the subunits.
B: If the connector acts like a nut and the DNA helix like a
threaded bolt, connector rotation could displace the DNA
into the prohead. Reproduced from Hendrix (44) with
permission.

Figure 6-5 Translocation of DNA based on connector
rotation and symmetry mismatch between the connector
and the DNA in SPP1. A: Contact (arrow) between
a monomer of the 13-fold connector and the phosphate
backbone of the DNA helix is broken and the connector
rotates 11° counterclockwise. B: Contact between the
connector and the helical DNA is re-established three
connector monomers to the right, two base pairs down the
helix, with a consequential translocation of these two base
pairs. Reproduced from Dube et al. (25) with permission.

A model that introduces a caveat to the active ATPase 
models above is one in which the head-tail connector of the 
phage might move along the DNA backbone in order to 
achieve DNA translocation. In rationalizing the observed
13-fold symmetry of the free SPP1 connector, Dube et al.
(25) proposed a mechanism of translocation in which monomers
in the 13-fold portal interact in set sequence with the
near 10-fold symmetric DNA helix. As monomers in set
sequence bind the backbone of the helix, the DNA must be 
drawn into the prohead as the perpendicular alignment of 
the connector and DNA is maintained (figure 6-5). This 
model was based on a 13-fold model for the SPP1 connector, 
which has since been shown to be a dodecamer in the 
prohead like other dsDNA phage connectors (66). However, 
the 12-fold nature of the connector does not exclude this 
model. 
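
The 11° rotation quoted for the SPP1 model can be recovered from the two symmetries alone. The Python sketch below is one reading of that geometry, not the authors' own calculation: it assumes an idealized DNA helix with exactly 10 bp per turn (the "near 10-fold" symmetry mentioned above), and the function and parameter names are hypothetical.

```python
def rotation_per_step(connector_fold, monomers_advanced, bp_advanced,
                      bp_per_turn=10.0):
    """Connector rotation (degrees) needed to restore backbone contact.

    The contact point moves `monomers_advanced` subunits around the
    connector while the DNA backbone winds through `bp_advanced` base
    pairs; the difference between the two angles is the rotation the
    connector must make to bring them back into register.
    """
    connector_angle = monomers_advanced * 360.0 / connector_fold
    dna_angle = bp_advanced * 360.0 / bp_per_turn
    return connector_angle - dna_angle


# 13-fold connector, contact re-established three monomers along,
# two base pairs down the helix (values taken from the text).
spp1_rotation = rotation_per_step(13, 3, 2)
print(round(spp1_rotation, 1))
```

With these inputs the three-monomer step spans about 83° and the two base pairs span 72°, leaving roughly 11° to be taken up by connector rotation.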

The φ29 connector structure reveals a most interesting
motif. Each of the 12 monomers spans the 7.5 nm high 
connector from top to bottom (39, 87). However, rather than 
simply traversing the connector, each monomer has three 
nearly parallel alpha helices that are canted at an angle 
approaching 30° to the axis of the connector, giving the 
overall structure the appearance of a spring. The structure 
of the connector therefore gives the impression that it is 
compressible. While no comparative structures have been 
presented to show connectors of different heights, atomic 
force microscopy has revealed that the connector can be 
reversibly compressed by 2.5 nm, about one-third of its 
height, under loads of 100 piconewtons or more (73). The 
connector structure and this demonstrated compressibility
serve as the basis for the translocation model described
below.

Figure 6-6 Schematic of the compression ratchet model of
the packaging mechanism. A: A packaging stroke begins
with the alignment of the wide end of the connector with
the prohead (gray circle) and the connector channel with the
DNA helix (gray triangle); the connector is in a compressed
form (right). B: As the connector narrow end releases the
DNA (the wide end holds the DNA) and extends, it rotates by
12° counterclockwise with respect to the head such that
contact with the DNA shifts to the next pair of connector
monomers (left) and two base pairs down the DNA helix
(right). C: During the subsequent compression of the
connector, the DNA, axially restrained at the narrow end
(but released at the wide end), is driven two base pairs into
the head (right); concurrently, the wide end of the connector
rotates passively 12° counterclockwise with respect to the
head (left), re-establishing contact with the head two
connector monomers to the left. Reproduced from Grimes
et al. (38) with permission.

It is proposed that the connector oscillates, extending 
and contracting along the long axis of the DNA inserted in 
the connector channel (figure 6-6) (87). There are two 
primary contact regions between the connector and DNA, 
at the connector wide and narrow ends, respectively. To 
start a cycle, the DNA is released by the connector narrow 
end, which rotates by 12° (counterclockwise as viewed 
toward the head) to maintain contact with the DNA backbone
as it extends down the DNA helix by 2 bp. Then the
narrow end closes on the DNA as the connector contracts to
drive the DNA into the head, and concurrently the connector
wide end releases the DNA and rotates 12° to realign the
connector with the prohead, pRNA, and gp16 ATPase. These
components are reported to be in contact and possess 5-fold
symmetry as suggested by cryo-electron microscopy. This
cycle repeats. The connector rotation involved is passive 
and serves to maintain alignment between the 6-fold 
connector and 5-fold DNA. How ATP hydrolysis mediates 
these events is unknown, and no direct quantification of 
the connector dynamics involved has been reported. 
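
The bookkeeping implied by this cycle can be sketched in a few lines of Python. The sketch is illustrative only: it assumes an idealized helix of 10 bp per turn, so that 2 bp correspond to 72°, and it takes the shift to the next pair of the 12 connector monomers as 60°, which is one way to arrive at the 12° figure. The genome length passed in at the end is a round, assumed value of roughly the size of the φ29 chromosome, and all names are hypothetical.

```python
BP_PER_TURN = 10      # idealized helix (assumption)
CONNECTOR_FOLD = 12   # monomers in the connector, from the text


def step_rotation_deg(bp_per_cycle=2, monomers_per_shift=2):
    """Rotation per cycle: DNA winding minus the shift in contact point."""
    dna_angle = bp_per_cycle * 360 / BP_PER_TURN                 # 72 degrees
    contact_shift = monomers_per_shift * 360 / CONNECTOR_FOLD    # 60 degrees
    return dna_angle - contact_shift


def cycle_summary(genome_bp, bp_per_cycle=2):
    """Cycles and net connector turns needed to package genome_bp."""
    cycles = genome_bp / bp_per_cycle
    turns = cycles * step_rotation_deg(bp_per_cycle) / 360
    return {"cycles": cycles, "connector_turns": turns}


summary = cycle_summary(19_300)  # assumed genome length in bp
print(step_rotation_deg(), summary)
```

Under these assumptions the connector would complete one full revolution every 30 cycles, that is, every 60 bp packaged.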

Structure of Packaged DNA 

Whatever the mechanism of DNA translocation, it must 
overcome the energetic barrier of compacting the DNA and 
deliver the DNA into the prohead to confer the proper structure
and organization within the head. DNA packaging in
dsDNA phages is endothermic. Analogies are often made 
between DNA packaging and the collapse of DNA into a 
torus, or DNA toroid, in the presence of polyamines such as 
spermidine (29) or hexamine cobalt (98). While the final 
structures share structural similarities, such as the organization
of DNA in hexagonal bundles, the processes are very
different. DNA condensation by polyamines or hexamine
cobalt is spontaneous and exothermic, while DNA compaction
in phages requires the input of energy mediated by
enzymatic function of the packaging machine. The DNA
toroid is stable, while the packaged DNA in phages is metastable.
This distinction is crucial, since the function of the
phage packaged DNA is to await delivery into a host cell,
and it must be ordered in such a way as to permit disassembly
of the structure during infection, unlike the DNA
toroid which may be irreversibly condensed. Therefore, 
DNA packaging is defined as a compaction event rather 
than a condensation. 

Several facts seem incontrovertible in describing the 
structure and organization of the DNA in the phage head. 
The DNA is in B-form (2), with spacing on the order of 
2.5 nm (92). Regardless of the specifics of the overall organization
of the DNA, it would appear that it is locally associated
in a hexagonal packing array, giving the DNA a quasicrystalline
appearance. What has been debated significantly
is the overall organization of the DNA within the head
shell. Models proposed over the past four decades describe
the gross features of packaged DNA as a wrap solenoid, a
liquid crystal, a spiral fold or a folded toroid (figure 6-7).
Until recently, insufficient, and often conflicting, experimental
data existed to distinguish among these models. However,
mounting evidence pushes consensus in the direction
of one of the original models of DNA organization in the 
phage head: the DNA solenoid. 
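
To give a feel for what a 2.5 nm hexagonal spacing implies, the Python sketch below estimates the volume occupied per base pair and the radius of a sphere that would hold a genome packed at that density. It is a back-of-the-envelope illustration, not a calculation from the cited work: the 0.34 nm axial rise per base pair is the textbook B-form value rather than a figure from this chapter, the 40,000 bp genome is an arbitrary example, and the function names are hypothetical.

```python
import math

SPACING_NM = 2.5        # interaxial spacing quoted in the text
RISE_NM_PER_BP = 0.34   # textbook B-form rise per base pair (assumption)


def volume_per_bp_nm3(spacing_nm=SPACING_NM, rise_nm=RISE_NM_PER_BP):
    """Volume of the hexagonal unit cell occupied by one base pair.

    In a hexagonal array each duplex occupies a cross-sectional area of
    (sqrt(3) / 2) * d**2, where d is the interaxial spacing.
    """
    cell_area = (math.sqrt(3) / 2) * spacing_nm ** 2
    return cell_area * rise_nm


def packed_sphere_radius_nm(genome_bp):
    """Radius of a sphere just large enough to hold the packed genome."""
    volume = genome_bp * volume_per_bp_nm3()
    return (3 * volume / (4 * math.pi)) ** (1 / 3)


per_bp = volume_per_bp_nm3()
radius = packed_sphere_radius_nm(40_000)  # hypothetical genome length
print(round(per_bp, 2), round(radius, 1))
```

The estimate ignores any protein inside the head and any departure from perfect hexagonal order, so it is best read as a lower bound on the interior size needed at this spacing.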

The model for packaged DNA structure that seemingly 
has the least amount of organization is the liquid crystal 
model (61). In this model, as in others, the bulk of the DNA 
is in tightly packed crystalline arrays. These arrays are 
small, however, and persist within the phage head as 
discrete domains with local structure, joined to other
randomly arranged packets of ordered DNA by short
stretches of disordered DNA. The DNA becomes organized
in this fashion as more and more DNA is translocated into
the prohead, and the DNA condenses in small regions into
hexagonally packed crystals. This is an intuitively appealing
model, but much of the experimental observation regarding
DNA structure within the phage head appears to be in
conflict with this model.

Figure 6-7 Schematic of sectional views of models of DNA packing in dsDNA phage virions. All proposed models of DNA
packing include hexagonally packed DNA, and differ according to the extent and type of global organization within the head
shell. Models for global organization include: A: the solenoid, B: the spiral fold, C: the liquid crystal, and D: the folded toroid.

Microscopy studies suggest there is higher symmetry to 
packaged DNA beyond the level of hexagonal packing 
promoted in the liquid crystal model. DNA released from 
disrupted heads often appears as a large coil, suggesting 
a gross organization of the whole chromosome (28). Cryo- 
electron micrographs of several phage heads and similar 
structures reveal patterns in the compacted DNA which 
resemble fingerprints, suggesting a pattern of loops of DNA 
inside the head (18, 82). Three models considered below that 
describe gross organization of the DNA in the head have 
experimental support. What is crucial, however, is that the 
model accounts not only for how the DNA can be accommodated
in the space of the prohead shell, but also for the DNA
translocation event. 

One model for packaged DNA is based on a derivative of 
the condensed toroid (48, 49). The DNA enters the prohead 
and forms a large donut-shaped structure of hexagonally 
packed DNA with a hole in the middle, essentially a DNA 
toroid. As the length of the DNA inside the prohead 
increases, the toroid collapses into a folded structure, and 
the packaged DNA density reaches the level found in the 
mature head. But this attempt to relate the structure of 
the toroid with packaged DNA fails in that the diameter of 
the torus that collapses into the folded form inside the head 
is much greater than the diameter of the prohead shell itself. 
To arrive in this final form, the DNA must organize into the 
folded toroid from the outset, or a smaller torus having the 
size of the head must rearrange into a larger toroid ring of 
the diameter of the final folded form. These events are 
unlikely, especially given the requirements for gross rearrangement
and the energetically unfavorable sliding of
DNA. Thus, while this model may have merit in describing 
how DNA can fit into the prohead, it does not address the 
obligate process of how DNA will reach this form during 
DNA packaging. 


A second and earlier model that was derived from experi¬ 
mental observation is the spiral fold model (8,9). Like others, 
this model describes hexagonally packed DNA, but unlike 
the liquid crystal model, the entire chromosome is orga¬ 
nized. Briefly, the DNA is arranged as a bundle of straight 
rods formed by the up and down winding of the DNA along 
the long axis of the prohead, with the DNA bending back on 
itself repeatedly, making 180° turns. Several lines of evidence 
support this model. A series of cross-linking experiments in
lambda, in which bis-psoralen agents are used to interrogate
the detail of interaction between packaged DNA and
the phage head shell, suggests that the DNA contacts the
head shell every several hundred base pairs (99). Thus, the
kinked DNA at the end of each spiral fold contacts the shell.
Ion etching experiments on λ (9), capable of probing the
spatial arrangement of DNA within the virion, also support
this model. 

The spiral fold model, like the folded toroid, presents a
reasonable form for the DNA in the full head but is difficult
to reconcile with the nature of DNA translocation. The sharp
180° bends in the DNA described by the spiral fold require 
the DNA helix to melt at these points. Considering that 
the DNA is processively driven into the head, it is difficult 
to imagine what would force the first complement of DNA 
to align in this way with such severe distortion of the 
helix. The DNA would seemingly prefer to trace a path
around the sphere of the prohead rather than reverse direction
and fold back on itself, since doing so would run counter
to the high persistence length, charge repulsion, and entropic
nature of DNA. Recently, new in vivo intracapsid DNA
cleavage experiments (72) have led to a slight revision 
of the spiral fold, bringing it more in line with the solenoid 
packing model (below). 

A third model of packaged DNA, the solenoid, has 
received ongoing attention for four decades. This model has 
been reinvigorated by recent support derived from cryo-electron
microscopy, which is capable of revealing relatively
fine detail without the impediment of artifacts generated by 
fixation or staining seen in traditional transmission electron 
microscopy. In most phages, raw images of filled heads or 
virions reveal a characteristic fingerprint pattern. Cerritelli 
and others (18) exploited the ability to preferentially orient 
phage T7 heads in vitreous samples, revealing that all T7 
heads have this characteristic pattern. Processing images
of these oriented head particles, and regenerating similar
images from a theoretical model of a DNA solenoid, provides
some of the best evidence that DNA in phage heads is
organized in a layered spool. Assessment of the DNA secondary
structure of T7 and other phages by Raman spectroscopy
also supports this model (76), as does recent
theoretical work (see below).

As mentioned above, the key consideration in exploring 
the structure of packaged DNA is the nature of DNA itself. 
First, DNA is stiff and has a relatively long persistence 
length: DNA in solution does not bend back on itself over a 
distance of less than 50 nm. Second, the highly charged
phosphate backbone of DNA makes it self-repulsive: in
order to compact DNA to a spacing of 2.5 nm and within
the confines of the phage head, this charge repulsion must
be overcome and will play a role in determining the organization
of packaged DNA. Considering these conditions and
the rules of entropic confinement allows for a physical 
reconstruction of packaged DNA that supports the solenoid 
model. 
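
The stiffness argument can be made semi-quantitative. The following is a minimal sketch, assuming the standard worm-like chain expression for elastic bending energy, E/kT = P L / (2 R^2), where P is the persistence length (50 nm, as quoted above), L is the contour length of the bent segment, and R is its radius of curvature. The function names and the example radii are illustrative assumptions, and the estimate ignores charge repulsion and the entropy of confinement.

```python
import math

PERSISTENCE_LENGTH_NM = 50.0  # value quoted in the text for DNA in solution


def bending_energy_kT(contour_length_nm, radius_nm,
                      persistence_nm=PERSISTENCE_LENGTH_NM):
    """Worm-like-chain bending energy, in units of kT, for DNA of the
    given contour length bent uniformly to the given radius of curvature."""
    return persistence_nm * contour_length_nm / (2.0 * radius_nm ** 2)


def full_loop_energy_kT(radius_nm, persistence_nm=PERSISTENCE_LENGTH_NM):
    """Energy of one complete circular loop (contour length 2*pi*R)."""
    return bending_energy_kT(2.0 * math.pi * radius_nm, radius_nm,
                             persistence_nm)


def hairpin_energy_kT(radius_nm, persistence_nm=PERSISTENCE_LENGTH_NM):
    """Energy of a 180-degree turn (contour length pi*R)."""
    return bending_energy_kT(math.pi * radius_nm, radius_nm, persistence_nm)


if __name__ == "__main__":
    # A loop hugging a shell of 25 nm radius (assumed, for illustration).
    print(full_loop_energy_kT(25.0))
    # A fold-back turn whose radius is half the 2.5 nm interstrand spacing.
    print(hairpin_energy_kT(1.25))
```

Under these assumptions a wide loop costs a few kT, whereas a 180° turn tight enough to fit the packed spacing costs roughly ten times more, which is consistent with the preference for spooling over folding described above.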

Providing more detail, when DNA enters the prohead, its 
stiffness and self-repulsive nature dictate that it remain 
relatively straight over the short distance from one side of 
the prohead to the other. When a length of DNA equivalent 
to several lengths of the prohead has been translocated, how 
will the DNA respond? It is likely that the DNA will form 
loops within the confines of the prohead, following the longest
path it can around the inner surface of the prohead
shell. As more and more DNA enters the prohead, concentric 
shells of DNA form, pushed outward from the center of the 
prohead, driven by the persistence of the DNA. As these 
layers form, charge repulsion between strands pushes back 
sequentially from layers at the outside of the shell. Thus the 
properties of DNA itself are enough to confer some level of 
higher order structure of the packaged DNA in that a stable 
equilibrium forms between the persistence of the DNA, 
which pushes the DNA away from the center of the prohead, 
and charge repulsion that pushes back. The model of such a 
solenoid structure formed in this fashion is one of the oldest 
proposed for packaged phage DNA, but only recently has the 
model been dealt with theoretically to a suitable degree. 

Odijk (74) has calculated the spacing prescribed by such 
a model and compared it with the observations made by 
Cerritelli et al. (18). It appears that the spacing observed in 
T7 heads containing different amounts of DNA agrees well 
with the theoretically derived spacings based on the principle
of equilibrium between DNA stiffness and self-repulsion.
Kindt et al. (56) took this theory one step further
in an attempt to replicate the way in which DNA is organized 
in the prohead during translocation. Their dynamic model 
interrogates how DNA organizes itself during translocation 
and not simply after the entire DNA complement has 
entered the prohead. This effort reveals that, at first, the 
DNA is disordered inside the prohead shell, but soon adopts
the conformation of the outer layers of a solenoid. As more
and more DNA enters the confines of the head shell, concentric
layers form from outside to inside of the solenoid.
Although the size of this theoretical prohead deviates from 
real phage systems by several-fold, the principle supports 
the idea that the physical nature of DNA alone can organize 
the DNA within the prohead. 

The end state of the packaged DNA also relates directly to 
the mechanism of DNA translocation in another way. For 
example, if the final conformation of the DNA inside the 
capsid shell is a solenoid, then as the DNA enters the prohead 
through the connector it must rotate axially with respect to 
the prohead as the incoming DNA winds in concentric coiled 
rings. A variant of the solenoid proposed by Serwer (84)
removes this necessity: rather than spooling continuously
in one direction, the DNA reverses direction occasionally
in its path around the solenoid. If no such reversals
occur in the solenoid, then the
translocation mechanism must accommodate the axial 
rotation required. The rotation of the prohead connector 
potentiated by the symmetry mismatch between prohead 
shell and connector leaves open the possibility that, regardless
of the details of the mechanism of translocation, such
required DNA rotation can be accommodated. Such a 
rotation event could also occur, or might be necessary, 
during DNA ejection during infection. In a reversal of 
packaging translocation, the connector and attached tail 
components might rotate relative to the head while the 
DNA moves into the host cell. 

The final structure of the DNA brings up an additional 
point: how much energy is invested in the DNA? This is relevant
to the energetics of DNA translocation in that the
packaging machine must maintain the capacity to drive the
DNA into the prohead. Recent single-molecule optical trap
studies with the φ29 system have revealed the force-velocity
relationship of the packaging motor (88). The DNA packaging
machine is remarkably strong at the molecular level,
having a maximum stall force on the order of 70 piconewtons.
By comparison, this is 5 times stronger than the
classical myosin molecular motor (58). Why is the packaging 
machine so powerful? Force-velocity measurements reveal 
that the last portion of the φ29 chromosome enters the
prohead against an internal force of 50 piconewtons. This 
suggests that the pressure of the packaged DNA within the 
prohead, and thus the force opposed in translocating the 
last segment of DNA, is on the order of 6 megapascals. 
Previous studies suggest that the containment pressure of 
the packaged DNA might assist in ejection of the DNA 
into the infected host cell. The φ29 work suggests that a
significant amount of energy is available for such a process. 
However, the φ29 single-molecule studies do not discern
whether all the energy consumed by the packaging machine 
is deposited in the packaged DNA, or whether some energy is 
dissipated. Theoretical estimates for the stored energy of 
the packaged DNA vary, and it is not clear whether the 
experimental set-up of the optical trap affects the packaging
motor. However, the internal capsid force estimated in
φ29 agrees quite well with the force calculated for expulsion
of T7 wild-type versus deletion mutant DNAs in titration 
calorimetry (78). 
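
The relationship between the internal force and the quoted pressure is simply pressure = force / area. The sketch below is an illustration only: it back-calculates the effective area implied by the two figures given in the text (50 piconewtons and about 6 megapascals). The text does not state which area was used, so the area is treated as an unknown, and the function names are hypothetical.

```python
PICONEWTON = 1e-12     # newtons
MEGAPASCAL = 1e6       # pascals
NM_SQUARED = 1e-18     # square metres


def pressure_mpa(force_pn, area_nm2):
    """Pressure in MPa exerted by a force (pN) acting over an area (nm^2).
    Note that 1 pN / 1 nm^2 equals 1 MPa, so this reduces to a ratio."""
    return (force_pn * PICONEWTON) / (area_nm2 * NM_SQUARED) / MEGAPASCAL


def implied_area_nm2(force_pn, pressure_mpa_value):
    """Effective area (nm^2) implied by a force (pN) and a pressure (MPa)."""
    return (force_pn * PICONEWTON) / (pressure_mpa_value * MEGAPASCAL) / NM_SQUARED


if __name__ == "__main__":
    # Figures from the text: 50 pN internal force, about 6 MPa pressure.
    area = implied_area_nm2(50.0, 6.0)
    print(f"implied effective area: {area:.2f} nm^2")
    print(f"pressure check: {pressure_mpa(50.0, area):.2f} MPa")
```

The implied area is a little over 8 nm², which is of the same order as the cross-section occupied by one duplex in a closely packed array; this is an interpretation, not a statement from the cited studies.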


Conclusions and Future Considerations 

The efforts and information described above are directed 
toward a single goal: the complete understanding of the 
processes and events of DNA packaging in dsDNA phages. 
While much progress has been made, many questions 
remain. The elucidation of the mechanisms involved in 
DNA packaging can be tackled like any other molecular 
mechanism by following the “path to enlightenment” (1), 
which requires the following: (i) a complete list of components,
(ii) the description of all intermediates, (iii) the
kinetics of all reactions, and (iv) atomic structures of all 
components. Some of the gaps in this effort include those 
listed below. 

Have all the components in the DNA packaging reaction 
been accounted for? Recent description of the role of the T4
late sigma factor, gp55, in DNA packaging (67) suggests that
we should be ever vigilant for missing components which 
may play a vital mechanistic role in DNA packaging. To date, 
only a single instance of the involvement of a packaging 
RNA has been described, in φ29. The onus is on investigators
in the field in general to seek out and remain open to similar 
new components in all systems under study. 

An ongoing controversy deals with the interaction of 
individual processes, such as prohead maturation and the 
initiation of DNA translocation, and whether such events 
are mechanistically coupled. To completely understand any 
given individual event, we must be aware that processes 
might depend upon, and in fact be driven by, other 
seemingly independent processes. 

Many of the unknown factors that remain to be investigated
relate to the structure and interaction of components.
Among these is the question of the order of assembly of
packaging ATPase components, DNA substrate, and receptacle
prohead. Is the order of assembly in vivo described in λ
the same in other systems, with the complex I structure,
composed of terminase and DNA, forming separately from
the prohead? Does the prohead play an integral role in DNA
maturation, as appears to be the case in P1? Can such
interactions help describe the control of DNA processing from
concatemers within the crowded environment of the 
host cell and the mechanism of DNA maturation prior to 
translocation? 

A considerable effort has been directed toward solving 
the structures of components of the DNA packaging 
machine at high resolution. Much work remains to be done. 
The singular example of the solution of the φ29 connector
structure must (and will) be matched with connectors from
other systems. Solution of the structure of packaging ATPase 
subunits is crucial in building a complete mechanism of 
their action (24). Atomic resolution of the prohead with 
embedded portal is as important. Finally, a complete picture 
of packaging will only be available when structures of all 
the various intermediates are solved, including that of the 
intact DNA packaging machine at various points in the ATP 
hydrolysis cycle. 

Broader issues precede such atomic-level concerns. What 
is the role of symmetry between components in the DNA 
packaging motor? Does rotation of components play a role, 
as proposed in several models described above? Will new 
motifs for molecular motors be revealed as we approach 
complete elucidation of the mechanism of DNA packaging? 

As a relatively mature discipline within molecular 
biology, research on phage DNA packaging has enormous 
advantages. The genetics for most systems are well established
and the production of large quantities of materials is
relatively easy, particularly when compared with eukaryotic 
systems. This puts the study of dsDNA packaging in the 
enviable position of being primed for the application of new 
technologies within the fields of biology and biophysics. 
Among these are single molecule approaches based upon 
optical tweezers and atomic force microscopy. Advanced
spectroscopic methods, including fluorescence, Raman, EPR,
FRET, and many others, seem tailor-made for many of the
questions waiting to be answered about the processes involved
in packaging. In addition, advances in soft matter physics 
seem newly capable of providing theoretical insight into the 
problems and mechanisms involved in DNA packaging, 
perhaps allowing a return of phage research to its roots 
in physics. 

Lastly, the context in which current and future results 
are viewed must continually be brought back to their cellular
origin. As complex as these processes might seem in
vitro, they are perhaps more complex in the in vivo world. 
This context is crucial if we expect to fully understand 
these events and processes, and later apply them to other 
areas of interest. 

Cell lines harboring latent viruses are common both in
eukaryotes and in prokaryotes. Prokaryotes harboring 
latent phages are lysogenic, and the latent form of the phage 
is called a prophage. Phages that can enter such a latent state 
are called temperate, although some authors erroneously 
refer to them as lysogenic. 

Lysogeny was discovered because some bacterial isolates 
spontaneously produce small amounts of infectious phage. 
It was later shown that lysogenic bacteria could arise 
during laboratory infection. 

Lysogeny is heritable within a bacterial lineage. When a 
lysogenic cell divides, both daughter cells harbor the prophage.
Very occasionally, a lysogenic cell spontaneously
lyses and liberates phage. Lwoff and Gutmann (5) demonstrated
this by separating bacterial cells at division by micromanipulation
and testing their descendants. Their results
eliminated the possibility that lysogeny might be an artifact 
of reinfection within a culture. 

The existence of lysogeny poses three basic questions, 
which have diverse answers among the various known 
temperate phages: (i) What is the physical nature of the 
prophage? (ii) What ensures that the prophage replicates 
during each division cycle and is segregated to both daughters
at cell division? (iii) Since the prophage must contain all
the genes needed to carry out a lytic cycle, what restrains 
it from doing so? The first two questions are very closely 
connected. The third was a cornerstone of Jacob and 
Monod’s (4) proposal that transcription initiation is a general 
mechanism for controlling gene expression. 

Although all temperate phages must achieve regular 
inheritance and control of lytic functions, they accomplish 
this in diverse ways. The purpose here is neither to chronicle 
all this variety nor to detail the mechanisms of particular 
phages discussed in other chapters. This chapter attempts 
rather to outline some frequently used mechanisms, emphasizing
their relevance to the general goals for which they
were selected. 


Nature and Mode of Replication of the Prophage

Nature of the Prophage 

In lysogens of most temperate phages (including coliphages
λ and Mu-1), the complete phage genome is inserted into the
continuity of the bacterial chromosome. Prophage DNA is
replicated and segregated as part of the chromosome. No
phage-specified enzymes are needed for prophage replication.
Synthesis of the replication enzymes used during lytic
infection is repressed, and their untimely synthesis can be 
deleterious to the cell. 

The major alternative mode of lysogeny (used by phage
P1) is for the phage genome to become established as a plasmid,
separate from the chromosome, which segregates in a
regular manner at cell division. The plasmid form is typically
circular and encodes some of the enzymes needed for its
own replication. In phage P1, both the replication genes and
the replication origin used by the prophage are different from 
those used in the lytic cycle. 

Table 7-1 lists some common phages classified according 
to the state of the prophage. Among the inserted prophages, 
two very distinct mechanisms of insertion are recognized. 
Three groups of phages (exemplified by λ, Mu-1, and P1,
respectively) will now be discussed in more depth. 

Prophages Inserted by Reciprocal, Site-Specific Recombination

λ is one of the many phages that insert their DNA into the
host chromosome by reciprocal recombination between
the chromosome and a circular form of the phage DNA.
The overall reaction comprises cointegration of the two
circles. The recombination takes place between a unique
site on the phage DNA and a unique site on the E. coli
chromosome (figure 7-1). Host and phage DNA are identical
in sequence for 15 bp at the crossover point. Fifteen base
pairs is not enough to allow recombination by the general
recombinase of the host (rec) or of the phage (red). Phage λ
encodes a protein (integrase) that efficiently mediates
reciprocal recombination at these sites. Integrase is made in
greatest abundance in cells that are recovering from infection.
After lysogeny is established, the integrase gene is
turned off.

Table 7-1 Modes of Prophage Maintenance

Mode                                Example

Insertion
  By site-specific recombination    Coliphage λ
                                    Salmonella phage P22
                                    Mycobacterial phage L5
                                    Vibrio phage CTX Phi
                                    Streptomyces phage φC31
  By transposition                  Mu-1
                                    D108
Plasmid formation                   Coliphages P1, P7
                                    Salmonella phage D6



Figure 7-1 Integration by site-specific recombination. 
Integration occurs between the attP site of the phage and 
the attB site of the host. The two circular molecules form 
a cointegrate, with junction points attL (left prophage end) 
and attR (right prophage end). Integration is carried out by 
a phage-coded protein, integrase. In many phages (including
λ), the reverse reaction (excision) frequently requires a
second phage-coded protein, excisionase. Host proteins
contributing to the reaction are not shown. 


This is precisely what is expected of a mechanism to 
promote lysogeny: it should go on after infection, so that 
every surviving cell has an inserted prophage. It is unnecessary
in an established lysogen, where insertion has already
occurred. Finally, if lysogeny is disrupted and the cell 
switches back to the lytic cycle (as occasionally happens 
spontaneously and can be induced), the prophage should 
be excised from the chromosome. For reasons that are not 
completely understood at a biochemical level, the insertion 
reaction is not directly reversible. To excise the prophage, 
another phage protein, excisionase, is needed. Following 
induction, the two proteins are produced coordinately. 
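
The bookkeeping of insertion and excision can be sketched with strings. This is a toy model under stated assumptions: the 15-bp region of identity is represented by a placeholder (not the real sequence), the arm labels are hypothetical, and nothing about strand exchange, host factors, or the control of direction by integrase and excisionase is modeled. It only shows how attB and attP give rise to attL and attR in a cointegrate, and how excision regenerates the two circles.

```python
CORE = "O" * 15  # placeholder for the 15-bp identity; not a real sequence


def integrate(chromosome, phage_circle, core=CORE):
    """Cointegrate a phage circle into a chromosome at the shared core.

    chromosome   is written B + core + B'  (the attB region).
    phage_circle is written P + core + P'  (attP; the string wraps around).
    The result is B + core + P' + P + core + B', i.e. attL ... attR.
    Each input must contain the core exactly once.
    """
    b, b_prime = chromosome.split(core)
    p, p_prime = phage_circle.split(core)
    return b + core + p_prime + p + core + b_prime


def excise(lysogen, core=CORE):
    """Reverse of integrate: return (chromosome, phage_circle)."""
    b, prophage_body, b_prime = lysogen.split(core)
    return b + core + b_prime, prophage_body + core


def same_circle(a, b):
    """True if strings a and b are rotations of the same circular molecule."""
    return len(a) == len(b) and a in b + b


if __name__ == "__main__":
    host = "gal-" + CORE + "-bio"
    phage = "A..J-" + CORE + "-int..R"
    lysogen = integrate(host, phage)
    print(lysogen)
    restored_host, restored_phage = excise(lysogen)
    print(restored_host == host, same_circle(restored_phage, phage))
```

In this representation the prophage gene order is a circular permutation of the order in the virion, as expected when a circle is opened at a site other than its packaged ends.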

The integrase/excisionase system puts the phage in
charge of the timing and direction of site-specific recombination.
If the phage and bacterium had a longer stretch of
homology and depended on general recombinases, insertion 
and excision would happen rarely and haphazardly. 

The effective irreversibility of the insertion reaction in 
the absence of excisionase has suggested that it might 
be adapted to insert DNA into specific sites of the human 
genome for gene therapy (7). The idea is to promote stable 
insertion into some human sequence resembling the bacterial
site. The integration system of Streptomyces phage
φC31 has been used for this purpose.

λ inserts in intergenic DNA, but many other temperate
phages, including some natural relatives of λ, insert at
sites that are within structural genes (21) or tRNA genes
(P22). What these sites have in common is an interrupted
(frequently imperfect) dyad symmetry centered on the crossover
point. In tRNA genes, this configuration is found in
the anticodon loop, which is used not only by the λ-related
phage P22 but also by coliphage 186, Haemophilus phage
HP1, and mycobacterial phage L5.

The λ integrase reaction proceeds through two successive
strand exchanges placed 7 bp apart. The symmetry of the
DNA site allows equivalent recognition at the two exchange
points. The sites used by some phages have no obvious
symmetry. The satellite coliphage P4 and its relatives
insert into tRNA genes, but in the TψC loop near the 3' end,
with no DNA symmetry centered on the insertion point. P4
uses an integrase of the same superfamily as λ integrase, but
the two integrases are barely related. Streptomyces phage
φC31 inserts into a site with no apparent symmetry, using
a site-specific recombinase from a different superfamily.
Thus the site-specific recombination systems used for
insertion/excision employ various mechanisms, but all produce
the same final result.

Insertion by site-specific recombinases requires a
double-stranded circle of phage DNA (figure 7-1). As packaged
in the virion, the DNAs of most temperate phages are
not double-stranded circles. They are generally converted to
that form early in phage development. λ, for example, has
linear DNA with projecting complementary single-stranded
5' ends, which pair following infection, allowing ligation to
a double-stranded circle. The cholera phage CTX Phi has
single-stranded circles in the virion, but like most phages,
it replicates as a double-stranded circle. Phage P22 has
headful packaging, where the DNA molecule injected
from the virion has a direct repetition (different among
virions) of double-stranded DNA at the ends. Within the cell,
homology-dependent recombination produces a circle (the
same for all virions). Some of these pathways to the circular
integration substrate are illustrated in figure 7-2. In all these
cases, the circular form is an intermediate in lytic development
as well as in integration.

Figure 7-2 Some pathways to the double-stranded, circular
integration substrate. See text for details.

Although site-specific recombinases of the integrase 
family occur in eukaryotes, they are not known to be used 
in viral integration. The dependent parvoviruses (so-called 
adeno-associated viruses) insert with high preference in a 
specific small human chromosomal segment, but there is 
no indication that the mechanism has much in common 
with phage insertion (8). 

Specialized Transduction 

Transduction is the transfer of genes from one host cell 
to another by a virus. There are two types: generalized 
(which results from errors in packaging) and specialized 
(which results from errors in excision from the chromosome 
of a lysogenic bacterium). Generalized transduction is not 
directly related to lysogeny. It is most easily observed with
temperate phages, but only because the potential transductants
formed by a virulent phage are generally destroyed by
lysis. This lysis is not caused by the transducing particles, 
which usually contain only host DNA, but rather by the 
phage particles in the lysate. 

Specialized transduction occurs only with temperate
phages, and is observed only in lysates produced by inducing
lytic development in a lysogen, not in lysates produced by
infection. Specialized transduction results from rare faulty
excision from the lysogenic chromosome. Instead of the
precise excision mediated by integrase and excisionase
(figure 7-1), an unknown enzymology causes breakage and
joining in heterologous DNA to produce packageable
chimeric genomes, part phage and part host. This can be
viewed as a natural form of gene cloning. Whereas any
part of the host genome can be transferred in generalized
transduction, specialized transduction is possible only for
a limited portion of the host chromosome flanking the
phage insertion site.

Figure 7-3 Formation of specialized transducing phages by
bacteriophage λ. The central column shows normal excision
of the phage genome, mediated by integrase and
excisionase. For every such excision, there are about 10⁻⁵
abnormal excisions, where either gal or bio is effectively
cloned into λ. These are rare because they require breakage
and joining of heterologous DNA. The reciprocal product
(a deleted chromosome), which may or may not be formed
in the same event, is not shown. The presence of the cos
site allows the excised DNA to be packaged into virions
and injected into other cells. Since λ is cut at cos
during packaging, part of cos occurs at each end of the
packaged DNA.

Figure 7-3 shows specialized transduction by λ of some
nearby genes: gal from one side and bio from the other.
λ gal and λ bio arise in separate events. Several constraints
on the process may be noted: (i) The genome of a specialized
transducing phage is a connected segment of the lysogenic
chromosome. Therefore, any λ carrying gal will also carry
all the genes between gal and attλ, and any λ carrying the
distal galE gene must also carry galK. (ii) The amount of
DNA is restricted by the packaging limits of the phage. λ
can package up to 110% of its 48.7 kb genome, so no gene
can be packaged that is farther than that from λ cos. By
extension, if a λ has accidentally inserted somewhere other
than its normal site, it will now transduce only genes
close to the new site. (iii) Specialized transducing phages
are frequently missing some phage genes. As implied by
figure 7-3, acquisition of host DNA to the left of the prophage
is accompanied by loss of genes from the right end of the
prophage. The phage genes that are eliminated may be vital
for the viral life cycle, so propagation of the specialized
transducing variant can take place only in cells coinfected
with a complete λ. (iv) The packaging site, cos, cannot be
lost because it is needed in cis. Therefore, all gal transducers
will include genes int-cos, and all bio transducers will
include cos-J.
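
Constraint (ii) is simple arithmetic and can be sketched as follows. The genome size (48.7 kb) and the upper packaging limit (110%) are the figures given above; the function names are hypothetical, and no lower packaging limit is modeled because none is stated here.

```python
LAMBDA_GENOME_KB = 48.7        # genome size quoted in the text
MAX_PACKAGING_FRACTION = 1.10  # upper packaging limit quoted in the text


def max_packaged_kb(genome_kb=LAMBDA_GENOME_KB,
                    fraction=MAX_PACKAGING_FRACTION):
    """Largest DNA length (kb) that fits in the head under the stated limit."""
    return genome_kb * fraction


def max_host_dna_kb(phage_dna_retained_kb):
    """Host DNA (kb) that can be carried, given how much phage DNA is kept."""
    return max(0.0, max_packaged_kb() - phage_dna_retained_kb)


def is_packageable(phage_dna_retained_kb, host_dna_kb):
    """True if the chimeric genome does not exceed the packaging limit."""
    return phage_dna_retained_kb + host_dna_kb <= max_packaged_kb()


if __name__ == "__main__":
    print(f"upper limit: {max_packaged_kb():.2f} kb")
    # A transducing phage that keeps 40 kb of phage DNA (assumed example).
    print(f"room for host DNA: {max_host_dna_kb(40.0):.2f} kb")
    print(is_packageable(40.0, 10.0), is_packageable(48.7, 10.0))
```

The example makes the trade-off in constraint (iii) concrete: a phage that retains its entire genome has room for less than 5 kb of host DNA, so larger acquisitions are possible only when phage genes are lost.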

The last line of figure 7-3 compares the structures of λ
gal and λ bio DNA with that of normal λ, as packaged in the
virion. The termini (produced by cutting at cos) are the same.
λ gals have a segment of host DNA whose left margin may
vary widely but whose right end is always at the phage insertion
site; in λ bios, the host DNA runs rightward from the
insertion site.

Figure 7-4 shows insertion by the λ gal specialized transducing
phage. λ gal may insert into the chromosome either
by general, homology-dependent recombination or by
site-specific recombination at att. The insertion is sometimes
accompanied by insertion of a coinfecting normal
λ (not shown in figure 7-4). The requirements for the integrase
reaction using the hybrid site attL on the λ gal are
different from those for a phage attP site; this generates a
helping effect by the coinfecting λ. The product of lysogenization
contains two copies of gal: the indigenous copy
already present before infection (shown as gal⁻ in figure
7-4) and the added copy from the λ gal (gal⁺ in figure 7-4).
Homology-dependent recombination within the chromosome
can “loop out” one copy or the other, so that lysogens
of λ gal are inherently unstable.

Insertion and excision by λ gal have served as a model for
replacement of genes by cloned, mutated copies. Whereas λ
insertion requires integrase, λ gal insertion can also take
place through homologous recombination with bacterial
DNA. The haploid segregants can be either Gal⁻ (reversal)
or Gal⁺ (replacement) depending on which side of gal the
crossover occurs to loop out the λ gal.

Prophage Insertion by Transposition 

The most familiar example of insertion by transposition is 
coliphage Mu-1. There are fundamental differences between
the life cycles of Mu-1 and λ. In Mu-1, there is no circular
form of the viral DNA. Also, viral DNA is always inserted
into host DNA, even in the virion. Both replication and insertion
result from transposition of viral DNA from one host site
to another. 

Transposition is diagrammed in figure 7-5. Transposition
has two possible outcomes, replicative and nonreplicative, of
which only the nonreplicative concerns us here. In cells
destined to lyse, the replicative mode is used and reused,
moving viral DNA into different sites and rearranging the
chromosomal DNA many times before the cell finally lyses
and liberates a crop of progeny virions. During packaging,
DNA is cut at nonspecific positions distal to the termini of
viral DNA, so that each virion contains a small segment
of host DNA (typically about 2 kb) with the 39 kb viral
genome inserted into it. In cells destined for lysogeny,
nonreplicative transposition splices viral DNA into the
chromosome. Any site on the chromosome can be used as a target for
transposition, so each lysogen has a unique insertion site.
Insertions within genes generally disrupt gene function,
which constitutes a kind of mutation, hence the name
“mutator phage.”

Figure 7-4 Insertion and looping out of specialized transducing
phage. The λ gal⁺ can insert into the chromosome
either by site-specific recombination at att or by
homology-dependent recombination in host DNA between gal and attλ.
The product (a heterogenote) has two copies of gal,
separated by λ DNA. During growth of the strain, the λ gal
can occasionally loop out through homology-dependent
recombination. Depending on the position of the crossover
point, the haploid derivative can carry either the gal gene of
the original chromosome or that brought in by the phage.
Looping out to give the chromosomal gene happens at
about 10⁻³ per generation. The rate for the phage-derived
gene (in this case, gal⁺) is lower and depends on the extent
of homology to the left of gal. The reciprocal product (an
excised λ gal which, if formed, does not replicate and is
therefore diluted out by growth) is not shown.

Mu-1 does not produce specialized transducing phages.
Every virion contains about 2 kb of flanking DNA, but this
is separated from viral DNA in the first cycle of transposition
following infection. However, Mu-1 has been modified
by genetic engineering to allow in vivo cloning of host DNA
segments of 5 kb or more (figure 7-6).







[Figure 7-5 labels: DNA in virion; target (host chromosome)]



Figure 7-5 Chromosomal insertion of phage Mu-1 DNA by a 
transposition mechanism. In the virion, Mu-1 DNA is 
attached to host DNA at both ends: less than 100 bp on 
the left, and about 2 kb on the right. Any part of the 
chromosome can be used as target. The first step of 
transposition is strand transfer, where the 3' ends of Mu are 
ligated to opposite strands of target DNA 5 bp apart, leaving 
free 3' ends of the chromosomal DNA and free 5' ends of the 
host DNA from the virion. The strand transfer product is 
redrawn in the next line for ease of visualization. Repair
synthesis initiated from the 3' ends of target DNA fills the
gap, the host DNA that entered from the virion is trimmed
off, and the target DNA 3' ends are ligated to the 5' Mu-1 
ends. Mu-1 can also undergo replicative transposition where 
the donor DNA is not trimmed and the transposon is 
replicated; however, this is only fruitful with a circular donor 
and does not occur in the initial transposition from linear 
virion DNA.

Figure 7-6 In vivo cloning with mini Mu lac. The initial mini
Mu lac includes, between Mu-1 ends, a selectable marker
(CmR), a plasmid replication origin (ori), and a lacZ gene
suitable for constructing translational fusions. It can be
packaged by complementation and introduced into a cell 
harboring a complete Mu-1 prophage. Induction of phage 
development leads to repeated rounds of replicative 
transposition (chapter 30) of both Mu-1 and mini Mu lac, so 
that the cell population will include cells where any desired 
gene is flanked by two directly oriented mini Mu lac 
prophages at an appropriate distance so that the two mini 
Mu lacs plus intervening DNA can be packaged into a Mu-1 
virion. In the recipient, homology-dependent recombination 
excises a plasmid containing the cloned gene X. Based on 
Groisman et al. (2). The method has also been modified to
use a mini Mu with λ cos sites, packageable by a complementing
λ, which allows cloning of longer segments (1).


Among eukaryotic viruses, the retroviruses employ an 
insertion mechanism almost identical to that of Mu-1. The 
rest of their life cycle is quite different, however. 

Lysogenization by Plasmid Formation 

Phage P1 is the prototype of lysogenization by plasmid
formation. Like P22, P1 DNA is packaged as circularly
permuted, terminally repetitious molecules, which circularize
soon after infection through recombination between
the terminal segments (cf. figure 7-2). In cells destined for
lysogeny, the circular molecules persist as plasmids, replicating
once during each cell cycle and segregating regularly at
cell division, like the bacterial chromosome. The replication 
origin for plasmid replication (which is distant from the 
origin used in lytic development) is closely linked to the 
determinants for orderly segregation. 

P1 encodes a site-specific recombinase, Cre, that acts on
a unique, completely symmetrical site (lox) on P1 DNA.
There are no lox sites on the bacterial chromosome, and
the Cre-lox system is not used for insertion (except for
rare, aberrant insertions into pseudolox sites). Cre serves
two functions in the phage life cycle: (a) Cre can mediate
the circularization of viral DNA that happens early after
infection (cf. figure 7-2). That occurs only among those
molecules where the lox site is within the terminal repetition,
but because P1 initiates packaging at a preferred site
(pac), such molecules are frequent. This circularization is
needed for lytic as well as lysogenic development. (b) Cre
facilitates segregation of the plasmid at cell division,
thereby increasing the stability of the lysogenic state.
During or after P1 replication, homologous recombination
between the two daughter molecules may create a dimer.
At cell division, the dimer can only go to one of the two
daughter cells, leaving the other plasmidless. Site-specific
recombination at lox resolves the dimer into two monomers,
which can then segregate normally. The rate of spontaneous
loss of the P1 plasmid is less than 10~ per cell
division with wild-type P1 and 10 “ when the lox has been
deleted (6).

Some eukaryotic DNA viruses, including Epstein-Barr 
and papilloma, can also establish themselves as plasmids 
that replicate within the nucleus. 

Defective Phages and Prophages 

Natural bacterial strains frequently harbor incomplete 
prophages (also called defective or cryptic ) as well as complete 
ones. In the wild, bacteria are continually subject to infection,
reinfection, and lysogenization. As lysogenic bacteria
grow, their prophages experience deletions, mutations, or 
insertions within the prophage genome. There is generally 
no selection against such events, so they accumulate over 
evolutionary time and are frequently observed in laboratory 
lysogens as well. Defective lysogens were recognized early in 
the modern era of phage biology as strains that retained 
some properties of lysogens (such as immunity to infection 
by phage of the carried type) but did not liberate plaque-forming
particles. Whole genome sequencing has underscored
their frequency in natural populations. For example,
the K-12 strain of E. coli harbors (in addition to λ) four or five
λ-related defective prophages, and the enterohemorrhagic E.
coli O157:H7 has seven (including a defective prophage at the
λ site; 3). These strains also carry several elements related to
phages P4 and P2. Several percent of the typical bacterial
genome is phage-derived. A similar fraction of the mammalian
genome is virus- (especially retrovirus-) related.

Some of these elements have attracted special attention 
because they contain, between attL and attR sites, both 
phage-related integrase genes and genes with special bacterial
functions, such as pathogenicity. It is generally not
known how the specialized genes became associated with 
the element or whether the complete element inserts or 
excises. 

Gene Regulation by Repression 
Regulation in a Lysogen 

In a lysogenic bacterium, the genes of the phage lytic cycle
must be turned off, or the cell would die. Turnoff is caused
by a phage-coded repressor (usually a protein, but in some
cases, as with phage P4, an antisense RNA). The repressor
prevents transcriptional initiation from key phage promoters.
This turns off almost all phage transcription. Repressor
acts directly on only a few promoters (in λ, there are two).
These are the promoters from which lytic development is
initiated. Phage genes expressed late in lytic development
are turned off indirectly, by the lack of early gene products
needed to activate their expression. The skeletal control
system in λ (where the repressor is coded by the cI gene)
is shown in figure 7-7. Repressor not only turns off
transcription from the lytic promoters pL and pR but also
promotes its own transcription. This is accomplished by
specific binding to a cluster of three repressor-binding sites
at each of the two operators. Late genes determining cell
lysis and virion components are not expressed in the lysogen
because their transcription is turned on only when the Q
protein antiterminates the pR' transcript, and the Q gene is
transcribed from pR.

Figure 7-7 Outline of gene control in a λ lysogen. Repressor
(product of gene cI) binds to operator sites (oL, oR, also
known as O_L and O_R, respectively), turning off transcription
from pL (also known as P_L) and pR (also known as P_R) and
turning on cI transcription from pM. Repressor blocks
transcription of late genes indirectly, by preventing transcription
of gene Q, whose product turns on late gene
transcription, and of gene N, whose product antiterminates early
gene transcription.

The lysogenic state is quite stable. About once in 10^4 cell
divisions, lytic development is spontaneously activated, and
the cell lyses and liberates phage. In many phages (including
λ), lytic development can be induced experimentally. The
most convenient method of induction is to heat a strain that
harbors a prophage with a ts mutation in the cI gene, so that
the repressor is thermosensitive and denatures at high 
temperature. A more natural method is to treat the cell with 
various agents (such as ultraviolet light) that activate the 
bacterial SOS system. 
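
The stated frequency of spontaneous activation, about once in 10^4 cell divisions, can be converted into the chance that a lysogenic lineage is still intact after a given number of divisions. The short calculation below is a sketch under one assumption that the text does not make: each division is treated as an independent event with the same probability. The function name is illustrative.

```python
SPONTANEOUS_INDUCTION_PER_DIVISION = 1e-4  # "about once in 10^4 cell divisions"


def fraction_still_lysogenic(divisions: int,
                             p: float = SPONTANEOUS_INDUCTION_PER_DIVISION) -> float:
    """Probability that a lineage has not been induced after `divisions` divisions."""
    return (1.0 - p) ** divisions


for n in (10, 100, 1000, 10000):
    print(n, round(fraction_still_lysogenic(n), 4))
```

On this simple model a single lineage is very likely to survive tens or hundreds of divisions, yet a culture containing many millions of cells will always include some that are releasing phage.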

The SOS system is a global regulatory system responsive 
to DNA damage. The basic circuitry is shown in figure 7-8. 
When DNA is damaged by irradiation or other means, 
single-strand oligonucleotides are formed as breakdown 
products. These activate a coprotease activity of the RecA
protein (which also mediates homology-dependent recombination).
This coprotease activity promotes cleavage of a
master control protein, LexA. RecA is considered a coprotease
rather than a protease because the LexA protein
can undergo spontaneous cleavage (autodigestion) under 
nonphysiological conditions, such as extreme pH, in the 
absence of RecA. LexA represses, directly or indirectly, a 
large number of genes whose products facilitate repair and 
recovery from DNA damage. So when LexA is cleaved in 
response to DNA damage, these proteins are synthesized. 
Many phage repressors (including λ's) mimic LexA in their
susceptibility to coproteolysis by activated RecA. When 
repressor is thus inactivated, lytic development ensues in 
almost every cell of a treated culture. Spontaneous phage 
production is RecA-dependent and therefore presumed to 
result from rare sporadic DNA damage, such as may occur 
when a replication fork occasionally stalls.

Figure 7-8 The bacterial SOS system and its relation to
induction of lytic development. DNA damage causes the
disintegration of some DNA to smaller degradation products,
of which single-stranded DNA is especially effective.
Single-stranded DNA activates the RecA protein to a form,
RecA*, whose coprotease activity accelerates the spontaneous
cleavage of the LexA protein into two inactive
polypeptides. The normal function of LexA is to repress
(either directly or indirectly) the transcription of damage-inducible
(din) genes of the host, many of which function
in DNA repair. The repressor proteins of many phages
(including λ) mimic LexA, and their cleavage is likewise
stimulated, inducing lytic development.


The repressor concentration is delicately poised. It is 
sufficient to control phage gene expression, but too much 
repressor impedes induction. Excess repressor synthesis is 
avoided because, although repressor stimulates transcription
from pM at moderate concentrations, high repressor
concentrations repress transcription from pM. 
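
The non-monotonic behavior described here (stimulation of pM at moderate repressor concentrations, repression at high ones) can be illustrated with a small numerical sketch. This is not a fitted model of the phage circuit: the two half-saturation constants, the basal and stimulated activities, and the function name are hypothetical values chosen only to reproduce the qualitative shape.

```python
def pm_activity(repressor: float, k_activate: float = 1.0, k_repress: float = 100.0,
                basal: float = 0.1, stimulated: float = 1.0) -> float:
    """Relative transcription from pM at a given repressor concentration.

    All parameters are illustrative and in arbitrary units:
    k_activate - concentration giving half-maximal stimulation
    k_repress  - (much higher) concentration giving half-maximal repression
    basal      - activity with no repressor bound
    stimulated - activity when fully stimulated and not repressed
    """
    activating = repressor / (k_activate + repressor)   # fraction stimulated
    repressing = repressor / (k_repress + repressor)    # fraction shut off
    return (basal + (stimulated - basal) * activating) * (1.0 - repressing)


for conc in (0.0, 1.0, 10.0, 100.0, 1000.0):
    print(f"{conc:8.1f}  {pm_activity(conc):.3f}")
```

Because activity falls again at high concentrations, repressor synthesis is self-limiting, which is the property the text identifies as keeping the lysogen inducible.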

If a lysogenic cell is externally infected by another phage 
of the carried type, the repressor present in the cell binds to 
the operator sites on the entering phage and prevents it from 
initiating lytic development. The lysogenic cell is said to 
be immune. Repressor (and hence immunity) is specific.
A cell lysogenic for λ can be lytically infected by unrelated
phages such as Mu-1, and vice versa. Among the natural
relatives of λ, there are a variety of repressor types; thus λ
and 434, which insert at the same chromosomal site, make 
different repressors, and each phage makes plaques on 
lysogens of the other. 

The circuitry has variations, even among close relatives
of λ. Phage P22, for example, encodes, in addition to a repressor,
an antirepressor protein that can bind to repressor
and neutralize its activity. Antirepressor synthesis is
turned off in lysogenic cells by a “maintenance” protein,
Mnt, which represses transcription of the antirepressor
gene. Phages such as P1 that lysogenize by plasmid formation
also control most of their genes by repression. Their
problem is inherently more complex than that of λ or Mu-1,
because the lysogen must express not only the repressor 
gene but also those genes needed for plasmid replication 
and partitioning. 

Although there are many examples of latency among 
eukaryotic viruses, there are few close analogues of lysogeny.
Retroviruses, for example, insert their DNA and
can replicate as part of a host chromosome, but they
carry out a productive cycle of infection while inserted. 
When retroviruses are carried permanently in an inactive 
state, it is generally because either the virus has mutated 
to a defective form or the particular host cell type is unable 
to support a complete infectious cycle. The closest thing 
to lysogeny occurs with some of the nonintegrated viruses, 
such as papilloma and members of the Herpes family. In both 
these cases there seems to be a programmed potentiality 
either to be carried as a plasmid (analogous to P1) or to
initiate a productive infection that culminates in virus 
liberation and cell death. In some of the herpesviruses 
this occurs in nondividing nerve cells, where the plasmid 
form need not replicate to persist; however, Epstein- 
Barr virus can establish a carried state in replicating 
lymphocytes. 


Repressor Control During Lysogenization 

Infection by a temperate phage has an indeterminate outcome.
Some cells in the culture become lysogenic, whereas
others are channeled into lytic development. If 100% of 
the cells gave a lytic response, the phage would be virulent. 
If 100% gave a lysogenic response, the phage would not form 
plaques and would not be readily recognizable as a phage. 
How these dual potentialities are achieved is best understood
in λ. Chapter 8 covers present understanding of how
the lysis/lysogeny decision is made in cells infected by λ.
A major player is the cII protein. Mutations inactivating
the cII gene strongly reduce the frequency of lysogenization,
although the few lysogens formed are stable. cII is
needed only for the establishment of lysogeny, not for its
perpetuation.

cII is a transcriptional activator that coordinates the
various aspects of lysogeny. To become lysogenic, a cell
must not only establish repression of lytic genes; the phage
DNA must also integrate into the chromosome, and the cell
must minimize the consequences of any lytic gene transcription
initiated before it commits to lysogeny. There is no advantage
to any one of these processes in the absence of the others. It is
futile to integrate into the chromosome if the cell will lyse
for lack of repressor. Repression is established by transcription
of the cI gene from a cII-dependent promoter within the
cII gene. This promoter is activated only in the transient
period during lysogenization. Once repression is established,
the cII gene is itself repressed, and cI is transcribed only from
the maintenance promoter pM.

Likewise, integrase is transcribed from a cII-dependent
promoter (pI) with a start site within xis. Thus integrase,
but not excisionase, is made in large amounts during lysogenization.
Following induction, on the other hand, int and
xis are coordinately expressed in the pL transcript. 

A third cII-dependent promoter, pAQ, makes an antisense
transcript that opposes expression of the Q gene, whose
product stimulates late gene expression through antitermination.
Therefore late gene products, which could kill
and lyse the cell, are not made by cells expressing cII.
Hence cells expressing cII, and only those cells, are
effectively channeled into lysogeny. Thus in λ the indeterminacy
of the outcome of infection depends completely on
fluctuations in cII production. Other phages may employ
very different mechanisms but, in order to be temperate,
they must achieve the same end result.

Gene Regulatory Circuitry of Phage λ

JOHN W. LITTLE 


The gene regulatory circuitry of phage λ was the first
complex regulatory circuit to be analyzed in molecular
detail. λ was chosen as a model system for this purpose
because it displays a wide range of interesting regulatory
properties. It can adopt two alternative life-styles, the lytic
and lysogenic states, and it can switch from the lysogenic
state to the lytic state in the process of prophage induction,
often called the “genetic switch” (13, 28). A choice is made
between these two alternative pathways soon after infection; 
the lysogenic state, in particular, is highly stable, and the 
switch can be highly efficient. It was clear early on that 
these properties would be of general relevance to other 
organisms, including higher eukaryotes. λ proved a fortunate
choice as an experimental system: its genetics is easy,
and many of the regulatory proteins are also biochemically 
tractable, allowing a highly productive interplay between 
genetics and biochemistry. As a result of work by many 
investigators, the proteins and cis-acting sites involved in 
these regulatory events are well known and intensively 
studied, and most of their regulatory interactions are 
known in great mechanistic detail. Hence, λ remains the
best-understood complex regulatory system. At the same 
time, new surprises continue to appear (12). 

The primary purpose of this chapter is to outline the
molecular events controlling the various regulatory decisions
and states in the λ life cycle; to indicate areas of interest
for future investigation; and to draw parallels between
events in λ and other systems.

λ Background

After infecting its host, λ can follow either of two mutually
exclusive pathways (Figure 8-1). In the lytic pathway, which 
resembles the life cycle of nearly all bacteriophages, the 
infected cell follows a temporal program of gene expression. 
Early gene products allow replication of viral DNA; late gene 
products include virion structural proteins, leading to 
assembly of mature virions, and proteins that cause cell 
lysis. In the lysogenic pathway, expression of the lytic genes
is blocked by a protein called CI (also known as λ repressor);
the viral DNA is integrated into the genome of the host by
the action of the phage Int protein and the host IHF protein.
In this form the phage is termed a prophage, and the viral
DNA is replicated as part of the host DNA, offering the
phage an alternative way to propagate. In the lysogenic
state, CI continues to be expressed, and it represses expression
of the lytic genes. This regulatory state is extremely
stable, but it can be switched to the lytic state when the host 
SOS response is induced. 

Hence, λ exhibits many interesting regulatory features,
most of which were investigated for the first time in λ.
Understanding these requires answers to several central questions.
First, a regulatory decision is made between two alternative
and mutually exclusive pathways. How is that decision
made, and how does it respond to cellular physiology?
Second, the lysogenic pathway leads to a highly stable regulatory
state. How is that state stabilized? Third, the lysogenic
state can break down. What are the events that lead to 
switching, and how do they fit with the needs of the phage? 
It is simplest to begin with the second question. 


Stabilization of Regulatory States 

As a starting point for understanding the regulatory events,
we consider the regulatory states of the λ system in light of
two rules involving two central regulatory proteins, the CI
and Cro proteins: If the cells contain CI and no Cro, they are
in the lysogenic state, and they continue to make CI and no
Cro. If, on the other hand, the cells contain Cro but no CI,
they are in the lytic state, and they continue to make Cro
and no CI. The primary exception to this latter rule occurs
during the lysis-lysogeny decision (see below). According to 
these rules, each regulatory state perpetuates itself, as we 
now describe. For a detailed review see Ptashne (28). 

Most of the regulatory events that stabilize the
regulatory states occur at a complex regulatory region
termed the O_R region (Figure 8-2D). As will become clear
below, a newly discovered long-range interaction occurs
between O_R and another site, the O_L region (Figure 8-2B),
and this interaction somewhat complicates our understanding
of the events at O_R. Nonetheless, the events at the O_R
region determine the regulatory state of the system.

Figure 8-1 Phage λ life cycle. At the top is depicted an infected cell; when linear λ DNA is injected it cyclizes through
cohesive termini, giving a circular molecule depicted at the left. The lysis-lysogeny decision is made 10-15 minutes after
infection. See text for details.

Figure 8-2 Maps of phage λ. The maps are to scale, as indicated; not all genes are shown. A: Map of the right half of the λ
genome. Genes mentioned in the text are shown; in addition, att is the site of site-specific recombination for integrating and
excising the prophage; xis is required for prophage excision; S and R are involved in cell lysis; and Q is an antitermination
protein for transcripts initiating at the late lytic promoter P_R', which otherwise terminate before S. Transcripts from the P_I,
P_L, P_R, and P_R' (also known as pI, pL, pR, and pR', respectively) promoters are shown. B: Close-up of the O_L region; this region
is organized similarly to the O_R region (O_L and O_R are also known as oL and oR, respectively), except that it lacks an oppositely
directed promoter analogous to P_RM. C: Close-up of the region around O_R. Locations of the P_R, P_RM, and P_RE promoters and their
transcripts are shown. cro is translated from the P_R transcript, not from the P_RE transcript. D: Extreme close-up of the O_R
region. Only the initial portions of the cI and cro genes are shown. The P_RM message starts with the AUG start codon of cI, and
lacks a Shine-Dalgarno sequence. See figure 27-2 for a more detailed map of the phage λ genome.

Figure 8-3 Molecular basis for bistability of λ circuitry. A: The O_R region is repeated from Figure 8-2D. B: Occupancy pattern
of O_R operators at moderate CI levels. CI binds as a dimer, and dimers at O_R1 and O_R2 make cooperative interactions.
Contrary to this cartoon, structural evidence suggests that both subunits of a dimer touch the other dimer; evidence with
LexA suggests that the “hinge” region connecting the dimerization and DNA-binding domains is shorter than implied here.
C: Occupancy patterns of O_R operators at moderate levels of Cro.

The O_R region contains five sites to which proteins bind:
two promoters, termed P_RM and P_R, and three binding sites
for regulatory proteins, O_R3, O_R2, and O_R1. The two regulatory
proteins, CI and Cro, bind to these three sites, but the
consequences of binding differ, as detailed below.

P_RM and P_R are expressed during the lysogenic state and
the lytic state, respectively. Each promoter expresses one of
the two regulatory proteins. Expression of P_RM leads to
synthesis of CI. Expression of P_R leads to synthesis of Cro,
as well as to synthesis of several lytic gene products (O, P,
and Q), which we will not consider further. Hence, the rules
given above can be restated: CI allows expression of P_RM
but not P_R; Cro allows expression of P_R but not P_RM. That
is, in each case the regulatory protein expressed by the 
promoter serves to stabilize the regulatory state that allows 
its expression. 

To understand the reasons for this, we next consider the
detailed organization of the O_R region (Figure 8-2D and
Figure 8-3). As stated, both Cro and CI bind to three binding
sites in this interval. However, binding of the two
proteins differs in several ways, and has markedly differing
consequences. Cro protein, which is simpler than CI,
can bind to O_R3, O_R2, and O_R1, but it binds most tightly
to O_R3. Hence, at moderate Cro levels, O_R3 is occupied
but O_R2 and O_R1 are free (Figure 8-3C). When Cro binds
to O_R3, it completely represses P_RM, since this promoter
overlaps O_R3. However, in this configuration Cro has no
effect on P_R, which does not overlap O_R3. As Cro levels
rise further, Cro can bind to O_R2 and/or O_R1, repressing
P_R; hence the level of Cro is self-limiting due to negative
autoregulation. 

The case of CI is more complicated. CI binds tightly to
O_R1 but weakly to O_R2 and O_R3. CI has an additional
feature, however, which leads to occupancy of O_R2: CI binds
cooperatively to adjacent binding sites. That is, occupancy of
one site increases the affinity of the protein for an adjacent
site roughly 200-fold (5). Since O_R1 is a tight binding site, this
leads to much tighter binding to O_R2. In addition, cooperative
binding by λ CI is “alternate pairwise”: some feature of
the protein-DNA complex prevents a CI dimer at a site from
interacting simultaneously with dimers at two flanking
sites. When O_R1 is mutated, CI can bind cooperatively to
O_R2 and O_R3, and the favorable free energy for this interaction
is about the same as for cooperative binding to O_R1 and
O_R2. However, a dimer at O_R2 cannot interact with dimers
at both O_R1 and O_R3; since binding to O_R1 is intrinsically
tighter than that to O_R3, occupancy of O_R1 and O_R2 is
favored (Figure 8-3B).
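
The effect of cooperativity on a weak site can be illustrated with a two-site equilibrium calculation. The sketch below is a simplification, not the published thermodynamic treatment: it considers only a tight site and one adjacent weak site (standing in for O_R1 and O_R2), ignores O_R3 and the monomer-dimer equilibrium, and uses hypothetical dissociation constants in arbitrary units. Only the roughly 200-fold cooperativity factor is taken from the text.

```python
COOPERATIVITY = 200.0  # fold increase in affinity for the adjacent site (from the text)


def site_occupancies(conc: float, k_tight: float, k_weak: float,
                     omega: float = COOPERATIVITY) -> tuple:
    """Fractional occupancy of a tight site and an adjacent weak site.

    conc, k_tight and k_weak share the same arbitrary concentration units;
    the k_* values are dissociation constants for each site taken alone.
    """
    w_tight = conc / k_tight              # weight: only the tight site filled
    w_weak = conc / k_weak                # weight: only the weak site filled
    w_both = omega * w_tight * w_weak     # weight: both filled and in contact
    z = 1.0 + w_tight + w_weak + w_both   # partition function over four states
    return (w_tight + w_both) / z, (w_weak + w_both) / z


# Hypothetical constants: the weak site binds 25-fold less tightly when alone.
with_coop = site_occupancies(conc=1.0, k_tight=1.0, k_weak=25.0)
without_coop = site_occupancies(conc=1.0, k_tight=1.0, k_weak=25.0, omega=1.0)
print("weak-site occupancy with cooperativity:   ", round(with_coop[1], 3))
print("weak-site occupancy without cooperativity:", round(without_coop[1], 3))
```

With these illustrative numbers the weak site goes from nearly empty to mostly filled once the cooperative contact is allowed, which is the sense in which tight binding at O_R1 "leads to much tighter binding to O_R2."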

Occupancy of O_R1 and O_R2 has two different consequences.
First, it leads to repression of P_R, which overlaps
these two sites. Second, it leads to activation of P_RM: when
O_R2 is occupied by CI, P_RM is about 10 times stronger than
when O_R2 is vacant. This positive autoregulation of CI expression
strongly reinforces the lysogenic state; that is, if CI is
present it continues to be made, and it stimulates its own
expression. If CI levels rise above a certain level, CI begins
to bind to O_R3, partially shutting off P_RM. It was long believed
that this negative autoregulation was unimportant in a
λ lysogen, but recent evidence shows that such negative
feedback does occur in λ lysogens, and that it involves a
long-range interaction with the O_L region (see below).

These observations suffice to explain the rules given
above: CI allows expression of P_RM by stimulating it, and
prevents expression of P_R by steric hindrance. At moderate
levels, Cro allows expression of P_R because it does not bind
to sites overlapping P_R, but represses P_RM by binding to a
site overlapping this promoter (Figure 8-3).
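
The two rules can be written as a small lookup to show why each state perpetuates itself. This is a deliberately crude Boolean sketch: protein levels are reduced to present or absent, the function names are illustrative, and the case in which both proteins are present (which the text treats as an exception arising during the lysis-lysogeny decision) is simply rejected.

```python
def promoter_states(ci_present: bool, cro_present: bool) -> dict:
    """Activity of P_RM and P_R for a given combination of regulators."""
    if ci_present and cro_present:
        raise ValueError("both regulators present: not covered by the two rules")
    if ci_present:
        return {"P_RM": "on", "P_R": "off"}    # CI stimulates P_RM, blocks P_R
    if cro_present:
        return {"P_RM": "off", "P_R": "on"}    # moderate Cro blocks only P_RM
    return {"P_RM": "weak", "P_R": "on"}       # P_RM is weak without CI


def next_regulators(ci_present: bool, cro_present: bool) -> tuple:
    """Regulators made next: P_RM yields CI, P_R yields Cro."""
    states = promoter_states(ci_present, cro_present)
    return states["P_RM"] == "on", states["P_R"] == "on"


print(next_regulators(True, False))   # lysogenic state
print(next_regulators(False, True))   # lytic state
print(next_regulators(False, False))  # neither regulator present
```

Each of the two single-regulator states maps onto itself, whereas a cell containing neither protein moves toward the Cro state.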

Although this account describes how the lysogenic
state is stabilized, it does not explain the way in which it is
established. Immediately after infection of a host cell by λ,
neither CI nor Cro is present. Hence P_R will be expressed
(P_RM is a weak promoter in the absence of CI), leading to Cro
expression. How can a cell ever achieve the regulatory state
of CI present and Cro absent? The answer, described further
below, is that CI can be expressed from a second promoter,
termed P_RE (Figure 8-2C), which is controlled in a different
way in response to the physiology of the cell. If an infected
cell is destined to follow the lysogenic pathway, P_RE is
expressed, leading to very high levels of CI; this overrides
the controls described above and shuts off expression of
both P_R and P_RM for several generations, until CI levels drop
due to dilution of cell contents during growth, allowing
expression of P_RM.

Unlike the lysogenic state, the lytic state cannot be and 
does not need to be highly stable, since cells following the 
lytic pathway lyse in about an hour. In certain mutants 
that block lytic growth, however, the lytic state can be 
stable (10, 26). In most studies, these mutants had a temperature-sensitive
cI allele, cI857, and defects in the N antitermination
protein and in a DNA-replication protein (O or P).
When lysogens are grown at 40°C to inactivate CI, then
shifted to 30°C, they remain in an “anti-immune” state for
many generations; this state is not observed if the prophage
also has a cro mutation. Hence, in this state, Cro has established
dominance over CI. The anti-immune state is not
completely stable; after many generations, a fraction of 
the cells switch back to the immune state. 


Prophage Induction: The Switch from 
the Lysogenic State to the Lytic State 

Although the lysogenic state is extremely stable, lysogens 
can switch to the lytic state. They do so by taking advantage 
of a host regulatory system termed the SOS response (11, 21). 
In response to conditions that damage DNA or inhibit DNA 
replication, perhaps 40 SOS genes are expressed at elevated 
levels. The products of these genes help the cell cope with the 
DNA damage in various ways. The SOS response involves the 
interplay of two regulatory proteins: RecA and LexA. During 
normal growth, LexA partially represses the SOS genes. 
After inducing treatments, the RecA protein is activated to
a form, termed RecA*, which catalyzes the specific proteolytic
cleavage of LexA. This reaction is unusual, in that
RecA acts indirectly as a coprotease to stimulate an inherent
self-cleavage activity of LexA (19, 20). Recent crystal structures
of several mutant LexA proteins indicate that this
activity of LexA is regulated by a conformational change 
(23). In the noncleavable conformation, the cleavage site 
is distant from the active site that catalyzes cleavage; it is 
likely that LexA is usually in this conformation. In the 
cleavable conformation, by contrast, the cleavage site and 
active site are juxtaposed. It is likely that RecA somehow 
stabilizes this conformation. 

λ CI is cleaved in an entirely parallel reaction. Structural
data (2) show that the active site is closely similar to that
of LexA; the cleavage site is not present in these structures.
CI has a slow self-cleavage activity, and RecA stimulates
this reaction. Hence, after conditions that induce the SOS
response, λ CI is also cleaved. The SOS response is a rapid
response; cleavage of LexA is nearly complete by 5 minutes
after UV irradiation. Since cleavage of CI is far slower, it takes
about 30 minutes for CI levels to drop low enough to derepress
the lytic P_R promoter; once P_R is derepressed, Cro is
made, which then binds to O_R3, probably preventing further
synthesis of CI and making the switch irreversible.

Cleavage of CI is much slower than that of LexA, both
because dimers of CI are resistant to cleavage and because
the rate is intrinsically slower. It is believed that this disparity
in rates reflects the biological niches of the two repressors:
LexA is designed for a rapid and reversible response to
inducing treatments, while CI has likely evolved to allow
efficient prophage induction only at doses of DNA damage 
that are likely to kill the host cell (20). 
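
The difference in timing can be pictured with simple first-order decay. The sketch below is an assumption-laden illustration, not a measured kinetic model: it ignores new synthesis and the dimer-monomer equilibrium, and the half-lives and the derepression threshold are hypothetical numbers chosen only so that the results fall near the timings quoted above (LexA cleavage nearly complete by 5 minutes, CI requiring about 30 minutes).

```python
import math


def minutes_to_fall_below(fraction_remaining: float, half_life_min: float) -> float:
    """Time for a first-order decay to reach the given remaining fraction."""
    if not 0.0 < fraction_remaining < 1.0:
        raise ValueError("fraction_remaining must be between 0 and 1")
    return half_life_min * math.log2(1.0 / fraction_remaining)


# Hypothetical values, chosen only to echo the timings quoted in the text.
LEXA_HALF_LIFE_MIN = 1.0
CI_HALF_LIFE_MIN = 10.0
THRESHOLD = 0.1   # assumed fraction of repressor at which promoters derepress

t_lexa = minutes_to_fall_below(THRESHOLD, LEXA_HALF_LIFE_MIN)
t_ci = minutes_to_fall_below(THRESHOLD, CI_HALF_LIFE_MIN)
print(f"LexA: {t_lexa:.1f} min, CI: {t_ci:.1f} min")
```

On this picture a brief pulse of RecA activation is enough to derepress the SOS genes but ends before CI has fallen far, which fits the suggestion that prophage induction is reserved for sustained damage.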

In addition to prophage induction caused by overt inducing
treatments, cultures of lysogens contain some free
phage particles. These are released in a process that also
requires the SOS response, as shown by two lines of evidence:
very low levels of free phage are seen in cultures of
host recA mutants, and in recA+ lysogens of λ containing a
cI ind- mutation, which blocks CI cleavage (22). This process
is often termed spontaneous induction. Although it can be 
argued that this is a misnomer since the investigator does 
nothing to induce the process, the process does result from 
induction of the SOS response, and the term reflects the 
belief that the switch to the lytic state results from induction 
of the SOS pathway, whatever the stimulus may be. 

Recent evidence points toward a likely source for an inducing
signal (7). In normally growing cells, replication forks
frequently encounter single-strand breaks or noncoding
lesions, leading to breakdown of the replisome. It is plausible
that such events generate a transient activation of RecA;
in this view, rare cells have enough damage to keep RecA
activated long enough to cleave CI, leading to switching.

Role of O_L

Recent evidence indicates that the O_R region is not a self-contained
regulatory module. It has long been known that
CI and Cro both bind to a second regulatory region, termed
the O_L region. In the lysogenic state, binding of CI to O_L is
necessary to repress expression of the lytic P_L promoter,
which drives expression of the N protein, an antitermination
protein essential for the lytic program of expression. Several new
lines of evidence indicate a second role for binding of CI to
O_L: CI bound to O_L can cooperatively interact with CI bound
at O_R (Figure 8-4).

First, in P_R::lacZ fusions, presence of O_L at a site distal to
lacZ leads to increased repression of lacZ (31). This interaction
requires only two operator sites at O_R and two sites
at O_L. Second, CI can form loops between O_L and O_R, as
observed by electron microscopy (31). Third, the lysogenic
P_RM promoter can also be repressed more efficiently by CI
when O_L is distal to the lacZ reporter gene; this effect is not
seen in the presence of an O_R3 mutation blocking CI binding,
indicating that P_RM repression involves CI binding to O_R3 (8).
In addition, lysogens of this O_R3 mutant are more difficult to
induce by UV light than wild-type. It is suggested that a
complex forms involving dimer-dimer interaction between
CI at O_R3 and at O_L3. Fourth, O_L3 mutations make it more
difficult to induce λ by UV, indicating a role for O_L3 and
further supporting the model of an O_R3:O_L3 interaction
(C. B. Michalowski and J. W. L., unpublished data).

Figure 8-4 O_R:O_L interaction. One possible set of interactions
between CI bound at O_L and O_R. On the left is shown an
interaction involving dimers bound at each of four sites. The
intervening 2.4 kb form a loop (indicated by the dashed arc).
This structure would form from a non-looped structure (not
shown) by cooperative interactions between tetramers at
each site. Both P_L and P_R are presumed to be repressed in
this structure, as they would be in the absence of looping
when CI binds. On the right is a complex containing
additional dimers bound to O_R3 and O_L3; in this complex,
P_RM is repressed. Other complexes may also form; for
instance, the O_L region might be inverted from the
orientation shown.

The primary interaction between O R and O L is likely
mediated by protein-protein interactions between tetramers
bound to the two regions, at O R 1 and O R 2 and at O L 1 and O L 2,
respectively (Figure 8-4). It is known that CI can form octamers
in solution, and a recent crystal structure (3) of the
C-terminal domain shows the structure of the octamer. In
this view, tetramer-tetramer interaction holds O R and O L
together; weak binding of dimers to O R 3 and O L 3 is strengthened
by cooperative interaction between the two dimers
(Figure 8-4). However, it remains plausible that several
different types of complexes can form; for instance, perhaps
O R can approach O L in two orientations, which would likely
allow different complexes to form. In addition, if multiple
complexes exist, they may interconvert readily.


Whatever the details of this interaction, it has several
regulatory consequences, which are beginning to be understood.
First, since it is a cooperative interaction, it probably
strengthens binding of CI to O R and to O L, conferring more
complete repression of the lytic P R and P L promoters during
the lysogenic state. Second, as stated above, it leads to substantial
negative autoregulation of cI at the levels of CI found in a
lysogen, ensuring a more constant level of CI. Finally, it is
tempting to speculate that cooperative binding provides
a way to coordinate derepression of the lytic promoters
during the process of prophage induction. In this view, as CI
levels drop due to cleavage and the long-range interaction
is broken, binding of CI to O R and O L would also be
weakened, causing CI to fall off both sites at the same time
and ensuring simultaneous derepression of P R and P L. One line of
evidence compatible with this is the finding that the phage
λO R 3'23' has a substantially reduced burst size after UV
induction (22), despite normal lytic growth; perhaps the
alterations of both O R 1 and O R 3 in this variant disrupt the
normal interaction between O R and O L.
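
The idea that a cooperative contact makes two sites fill and empty together can be illustrated with a minimal equilibrium sketch. This is not a model taken from the cited work: the two-site partition function below is a generic textbook form, and the function name, the association constants, and the cooperativity factor omega are all illustrative assumptions rather than measured values for CI.

```python
def two_site_states(conc, k1, k2, omega):
    """Probabilities of (empty, site 1 only, site 2 only, both) for two sites.

    conc   : free repressor concentration (arbitrary units, assumed)
    k1, k2 : association constants of the two sites (assumed values)
    omega  : cooperativity factor; omega = 1 means independent binding
    """
    weights = (
        1.0,                          # neither site occupied
        k1 * conc,                    # only site 1 occupied
        k2 * conc,                    # only site 2 occupied
        omega * k1 * k2 * conc ** 2,  # both occupied, stabilized by the contact
    )
    total = sum(weights)
    return tuple(w / total for w in weights)


if __name__ == "__main__":
    # Compare independent binding (omega = 1) with strong cooperativity.
    for omega in (1.0, 100.0):
        for conc in (0.3, 1.0, 3.0):
            empty, one, two, both = two_site_states(conc, 0.1, 0.1, omega)
            print(f"omega={omega:5.0f} conc={conc:3.1f} "
                  f"single={one + two:.3f} both={both:.3f}")
```

With a large omega the singly occupied states become rare relative to the doubly occupied one, so in this toy picture the two sites tend to be occupied or vacated as a pair, which is the qualitative behavior the speculation above relies on.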

Lysis-Lysogeny Decision 

As described above, CI must be present to stimulate expression
of P RM in the lysogenic state. In a cell that is following
the lysogenic pathway, the initial burst of CI is provided by
expression of another promoter, P RE (9, 16). Regulation of
this promoter involves physiological events that are not
fully understood. The direct positive regulator of P RE is the
phage protein CII, which binds at P RE (Figure 8-2C) and activates
its expression. The level of CII in the cell is in turn
under complex control. CII is degraded by a host protease,
FtsH (or HflB) (18, 32), and the action of FtsH is antagonized
by another phage protein termed CIII. CIII apparently acts as
a competitive inhibitor of FtsH (14). It is unclear whether
physiological conditions also influence the activity of FtsH. 
Infection of cells with multiple phages allows more efficient 
lysogenization, and it is believed that the high gene dosage 
provides more CII and CIII, stabilizing CII. Even under these 
conditions, however, not all the infected cells follow the 
lysogenic pathway. 
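
To make the notion of competitive inhibition concrete, the sketch below uses the standard Michaelis-Menten rate law with a competitive inhibitor. It is only an illustration of that general rate law: the function name and every parameter value (vmax, km, ki, and the concentrations) are hypothetical, and nothing here is a measured property of FtsH, CII, or CIII.

```python
def cii_degradation_rate(cii, ciii, vmax, km, ki):
    """Rate of substrate turnover in the presence of a competitive inhibitor.

    cii  : substrate concentration (here standing in for CII)
    ciii : inhibitor concentration (here standing in for CIII)
    vmax : maximal rate of the protease (assumed)
    km   : Michaelis constant for the substrate (assumed)
    ki   : inhibition constant of the inhibitor (assumed)
    """
    # A competitive inhibitor raises the apparent Km but leaves Vmax unchanged.
    km_apparent = km * (1.0 + ciii / ki)
    return vmax * cii / (km_apparent + cii)


if __name__ == "__main__":
    for ciii in (0.0, 1.0, 5.0):
        rate = cii_degradation_rate(cii=1.0, ciii=ciii, vmax=1.0, km=1.0, ki=1.0)
        print(f"inhibitor={ciii:3.1f} rate={rate:.3f}")
```

In this picture, raising the inhibitor level slows degradation of the substrate at a given substrate concentration, which is one way to read the suggestion that higher gene dosage, by supplying more CIII, stabilizes CII.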

The lysis-lysogeny decision is made about 10-15 minutes
after infection (9). During this time, expression of the early
lytic genes N, O, and P, and the cII and cIII genes, has
occurred. It is likely that the decision is only made after this
time to allow CII, CIII, FtsH, and perhaps another factor,
HflKC (see below), to interact in such a way as to allow
sensing of the physiological state of the cell. In a cell following
the lysogenic pathway, CII also activates expression of
the Int protein, which is required for integration of the viral 
genome into that of the host. 

The lysis-lysogeny decision has been simulated by a
computer model (25) that involves all the known components
(but not the O L :O R interaction). This simulation
mimics the in vivo behavior of multiply infected cells, in that
not all simulations lead to the lysogenic response. The
computer model is a “stochastic” model, in which expression
of promoters, translation and degradation of mRNAs, and
degradation of proteins occur not at a fixed rate but with a
given probability per unit time. Different simulations give
widely varying levels of gene products; in particular, the 
level of CII varies markedly from one run to another. The 
authors ascribe the variable response of the model to this 
stochastic behavior, and note that this type of stochastic 
behavior should have the most pronounced effects in cells 
with small numbers of molecules. 
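
The phrase "a given probability per unit time" can be illustrated with a very small stochastic simulation. The sketch below is not the published model (25); it is a generic Gillespie-style birth-death process for a single protein, with a hypothetical function name and made-up rate constants, meant only to show how identical parameters can yield different molecule counts from run to run.

```python
import random


def simulate_protein(k_make, k_deg, t_end, seed):
    """Return the protein copy number at time t_end for one stochastic run.

    k_make : probability per unit time of making one molecule (assumed)
    k_deg  : probability per unit time of degrading each molecule (assumed)
    t_end  : length of the simulated interval
    seed   : seed for the random number generator, one per run
    """
    rng = random.Random(seed)
    t, n = 0.0, 0
    while True:
        a_make = k_make          # propensity of synthesis
        a_deg = k_deg * n        # propensity of degradation
        a_total = a_make + a_deg
        t += rng.expovariate(a_total)   # waiting time to the next event
        if t > t_end:
            return n
        if rng.random() * a_total < a_make:
            n += 1               # synthesis event
        else:
            n -= 1               # degradation event


if __name__ == "__main__":
    finals = [simulate_protein(1.0, 0.1, 50.0, seed) for seed in range(10)]
    print("final copy numbers in ten runs:", finals)
```

Because the expected copy number here is small (about ten), the spread between runs is a large fraction of the mean, in line with the remark that stochastic effects are most pronounced when molecule numbers are low.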

Biochemical analysis of CII degradation has been complicated
by several factors. First, ftsH is an essential gene,
complicating genetic analysis. Second, FtsH is a membrane-bound
protein, making biochemical analysis difficult.
Third, its activity in vivo is apparently modulated by interaction
with the HflKC complex. Mutations inactivating
the HflKC complex also favor lysogeny, suggesting that
its interaction with FtsH modulates the activity of FtsH.
Fourth, HflKC is also membrane-bound, and the portion not
embedded in the membrane lies primarily in the periplasmic
space. This greatly complicates in vitro reconstruction
of its interaction with FtsH, since the nonmembrane
portions of HflKC and FtsH lie on opposite sides of the
inner membrane.


What Is Essential, and What Is 
Refinement? 

Are the detailed features of the λ regulatory circuitry essential
for its proper operation? Or are they refinements to a
basic ground plan? This issue is important for evolution of 
complex gene regulatory circuitry. If a circuit needs all its 
“bells and whistles” to work at all, then one would expect 
that it would arise during evolution only with difficulty. If, 
by contrast, a stripped-down version of the regulatory circuit 
would suffice in the initial stages of evolution, and the 
refinements were added later, the circuit would seem far 
more likely to arise (22,29). In this two-step model, the initial 
version would have to offer a selective advantage, such as 
the lysogenic life-style; the later refinements would confer 
optimal behavior. 

Relatively limited analysis of other lambdoid phages, such
as 434 and HK022, suggests that these phages have similar
behaviors, such as positive autoregulation of CI, cooperative
DNA binding by CI, and differential occupancies of the O R
operators (6, 28, 34). With this limited data set, we can infer
that these properties are generally advantageous to a lambdoid
phage, and have been selected during evolution. In
addition, since the repressors have different specificities, at
least some of these features likely evolved independently in
different immunity specificities.


Several recent lines of evidence suggest that a two-step
model is compatible with the properties of the λ circuitry.
First, the differential affinities of Cro and CI for the various
operators in O R are not an essential part of λ regulation,
since O R 1 and O R 3 can be made identical and the phage
exhibits near-normal behavior (22). These two sites differ in
three positions. Three variant phages were made in which
the operators at the sites of both O R 1 and O R 3 have the
same sequence, either those of O R 1, O R 3, or a hybrid site
differing from O R 3 by one change. All these variants are
able to grow lytically and readily form stable lysogens, and the
lysogens are readily induced by UV. They show quantitative
differences, but their qualitative operation is like that of the
wild-type. At the mechanistic level, the behavior of these
phages can be rationalized in terms of the known properties
of CI, Cro, and the O R region. In particular, in a lysogen it is
likely that CI frequently alternates between cooperative
binding to O R 1 and O R 2 (as in the wild-type, Figure 8-3B)
and to O R 2 and O R 3 (not shown), repressing both promoters;
the sum of these two occupancy patterns is P RM ON and P R
OFF. In any case, although the differential affinities shown
by CI and Cro for the O R operators appear to make the circuit
work better, they are likely a refinement.

This is an example of a property termed “robustness,” 
which means that variations in the parameters of a system 
still allow its normal or near-normal function. Robustness 
facilitates a two-step pathway for evolution, since it permits 
a wider range of circuits to arise initially (1, 22). 

Second, cooperative binding of CI is not essential.
We have found that a cooperativity mutant in cI (Y210N), in
combination with two mutations in the O R region, exhibits
near-normal behavior (A. C. Watson and J. W. L., unpublished
data). The Y210N mutant has been shown to be defective in
cooperative binding to adjacent sites in vivo (35), and
presumably it blocks the interaction between O L and O R as
well. Again, these data strongly suggest that cooperativity
is a refinement rather than an essential feature.

Third, positive autoregulation by CI at P RM is not essential
(C. B. Michalowski and J. W. L., unpublished data). Two
mutations in cI, D38R and D38N, were previously shown
to prevent P RM activation in a model system. When crossed
onto a phage, D38R does not allow lysogenization, but it
can be suppressed by mutations in P RM. A phage carrying
D38N can form extremely unstable lysogens, but changes
in P RM allow it to form lysogens about as stable as wild-type.
These P RM mutations do not restore positive control
by D38R or D38N mutant CI. This analysis suggests
that positive autoregulation of cI expression is also a
refinement.

With respect to the role of CII in regulation, at least one
phage, 21, does not have an identifiable P I promoter for
CII-dependent int expression (36). In addition, in λ the CIII
function is dispensable in the presence of a particular CII
mutant allele, can-1 (16). These findings suggest that aspects
of the CII system are refinements as well.
Generalizations to Other Regulatory
Systems

Many of the regulatory principles described above were first
established by studies on λ, and all are generally applicable
to a wide range of regulatory circuits.

Establishment and Maintenance 

The principle that a regulatory decision is first made in an 
establishment phase, then ratified and made permanent in 
a maintenance phase, is observed in a variety of systems. 
Perhaps the best studied case is that of Sxl in Drosophila. In
the establishment phase, Sxl is expressed from the Pe promoter
in females but not in males (17). Translation of the Pe
transcript makes active Sxl protein. In the maintenance
phase, the Sxl gene is expressed from a different promoter,
the Pm promoter, both in males and in females; however,
the following positive autoregulatory loop ensures that Sxl
protein is made only in females. The Sxl protein is required
for proper splicing of the latter mRNA, and only in its presence
can the Pm transcript be spliced properly. Accordingly,
in females the presence of Sxl protein made in the establishment
phase allows more Sxl protein to be made in the
maintenance phase; in males, the Pm transcript cannot be
properly spliced, and no Sxl protein is made. As another
example, Drosophila proteins in the Polycomb group act to 
maintain patterns of gene expression established by other 
mechanisms, doing so by maintaining repressive chromatin 
structures (4). 

Multiple Promoters 

The case of P RE and P RM was the first instance in which two 
different promoters were shown to drive expression of the 
same gene (30). Importantly, expression of each of these 
promoters has a different regulatory meaning, as described 
above. Perhaps the best-analyzed eukaryotic system that 
reflects these principles is that controlling expression of 
“pair-rule” genes in Drosophila. In a well-studied case, the 
eve gene is expressed in seven stripes in the early embryo 
(33). Each of these stripes appears to be under separate 
control, and several of them are regulated by a discrete 
region of the eve regulatory region; for instance, the stripe 2 
enhancer contains binding sites for four proteins, most of
which are products of “gap” genes, and the proper combination
of these proteins is found only in the region in which
stripe 2 is expressed (33). This stripe 2 enhancer can be separated
from the rest of the eve upstream region, and it functions
autonomously.

In this system, then, there are seven different sets of 
inputs that give the common output of eve expression. 
These seven inputs correspond to conditions occurring in
seven regions or zones along the anterior-posterior axis of
the embryo. The common output establishes seven zones of
expression of the same regulatory protein. Concurrent regulation
of several other pair-rule genes establishes a pattern
that repeats seven times in the embryo, dividing it up into 
seven regions that then develop further in a pattern that is 
in many ways equivalent in each region. 

The logic of the eve system is the same OR logic as that
first established in λ: “Express the gene if condition 1 OR
condition 2 OR ... condition 7 is met.” That is, any of seven
different conditions suffices to turn the gene on. This logic
can be contrasted with another well-analyzed example,
that of HO regulation in Saccharomyces cerevisiae, in which
the logic is “express the gene only if condition 1 AND
condition 2 AND condition 3 are met” (15).
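
The contrast between the two kinds of regulatory logic can be written out as a short sketch. The function names and the example input lists are purely illustrative assumptions; the conditions are abstract booleans and do not represent any particular measured input to eve or HO.

```python
def or_logic(conditions):
    """Gene is ON if any single input condition is met (eve-like logic)."""
    return any(conditions)


def and_logic(conditions):
    """Gene is ON only if every input condition is met (HO-like logic)."""
    return all(conditions)


if __name__ == "__main__":
    # Seven hypothetical inputs, only one of which is satisfied.
    stripe_inputs = [False, True, False, False, False, False, False]
    # Three hypothetical inputs, one of which is not satisfied.
    ho_inputs = [True, True, False]
    print("OR logic, one of seven met:", or_logic(stripe_inputs))
    print("AND logic, two of three met:", and_logic(ho_inputs))
```

Under OR logic a single satisfied input is enough, whereas under AND logic a single unsatisfied input keeps the gene off.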

Alternative Stable Regulatory States 

Stable regulatory states are relatively rare in prokaryotes; 
most regulatory systems are rapidly reversible, allowing 
dynamic responses to changes in the environment. In Metazoa,
stable regulatory states are universally used to establish
and maintain particular cell types. The mechanistic details 
in higher eukaryotes are certainly more complex, in part 
since they involve heritable states of chromatin. 

Threshold Behavior 

Prophage induction in phage λ exhibits threshold behavior;
below a certain threshold level of DNA damage this process
is rather inefficient, while above the threshold most or all of
the cells switch their state (e.g., 22). Many examples of
threshold behavior are known in eukaryotic regulation.
Once again, the best-analyzed examples are in Drosophila.
During early embryogenesis, several gap genes are
expressed only above a certain threshold of maternal gene
products (24, 27); for instance, zygotic Hb is made only
above a certain threshold concentration of Bcd protein.
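
A common phenomenological way to describe threshold behavior is a steep sigmoidal dose-response curve. The sketch below uses a Hill function for this purpose; the choice of a Hill function, the function name, and the values of the threshold and steepness are assumptions made for illustration and are not taken from the studies cited above.

```python
def hill_response(signal, threshold, steepness):
    """Fraction of maximal response for a given input signal.

    signal    : input level (e.g., dose of an inducing treatment), assumed units
    threshold : signal giving a half-maximal response (assumed)
    steepness : Hill coefficient; larger values give a sharper switch (assumed)
    """
    s = signal ** steepness
    return s / (threshold ** steepness + s)


if __name__ == "__main__":
    for signal in (0.25, 0.5, 1.0, 2.0, 4.0):
        print(f"signal={signal:4.2f} "
              f"shallow={hill_response(signal, 1.0, 1):.3f} "
              f"steep={hill_response(signal, 1.0, 8):.3f}")
```

With a high steepness the response stays near zero below the threshold and near its maximum above it, whereas a steepness of 1 gives a gradual response over the same range.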


Conclusions 

The λ regulatory circuitry is well understood at the mechanistic
level. The behavior of mutants is generally consistent
with the in vitro behavior of the regulatory proteins and
their targets, and with mutant variants of these elements.
As noted above, a relatively successful computer simulation
has been developed of the lysis-lysogeny decision that relies
on quantitative estimates for about 40 different parameters.
This simulation can reproduce the lysogenization frequency
of cells infected with various multiplicities of λ. On the other
hand, it does not predict the large overshoot of CI synthesis
from P RE, indicating the need for further refinement of the
model. Nonetheless, the success of this approach suggests
that we know most of the elements involved in the
lysis-lysogeny decision, and can make reasonable quantitative
inferences about their properties and interactions.

Several fruitful avenues remain for future research. The
CI-mediated interaction between O R and O L needs far better
characterization, as regards both the nature of the complexes
formed and the functional consequences of this interaction.
Further analysis of events in the lysis-lysogeny
decision is important; among these are the detailed mechanisms
of CII cleavage by FtsH, the modulation of FtsH activity
by other cellular proteins such as HflKC, the inhibitory
action of CIII, and particularly the relationship between
cellular physiology and the efficiency of CII cleavage. Finally,
further analysis of simplified versions of the λ circuitry, and
comparison with the properties of other lambdoid phage
circuits, should help lend insights into mechanisms by
which complex circuits could plausibly evolve.
Regulation of λ Gene Expression by
Transcription Termination and
Antitermination

DAVID I. FRIEDMAN 
DONALD L. COURT 


Regulatory effects on gene expression at the level of
transcription elongation are now recognized to be
widely operative in both prokaryotes and eukaryotes (23, 97,
117, 123). Recognition of this mode of regulation can be
dated to the 1969 Nature paper by Jeff Roberts (138). Not
only did this work identify the Escherichia coli Rho termination
protein and propose that there are specific transcription
termination sites in the phage λ genome, but it
led to the proposal that the λ N transcription regulation
protein promotes gene expression by allowing RNA polymerase
(RNA Pol) to transcend transcription terminators.
The ideas first presented in this seminal paper have served
as the basis for over 30 years of intensive work not only
with phages but also with organisms on all levels of the
evolutionary ladder. However, studies on λ and related
lambdoid phages have continued to be prime sources of
information about the molecular events influencing transcription
elongation. It is the purpose of this chapter to
update a previous review of this subject (48). We will present
a short summary of what we consider essential material
from that review as background and then a more extensive
review that focuses primarily on subsequent work that
we think has significantly extended understanding of the
process of gene regulation by transcription termination
and antitermination.

This chapter will focus on the lambdoid family, because
the terminators and antiterminators of these phages have
served as major tools in the study of transcription termination
and antitermination. The impact of the work on these
phages is clearly evident from the large number of review
articles published on the subject since the last review in this
series. We point to these reviews because they undoubtedly
have different emphases than will be found in this chapter
(29, 30, 50, 51, 67, 70, 74, 133, 137, 141, 177).


Note that we will use uppercase letters to indicate RNA
(e.g., NUT) and lowercase italics (e.g., nut) to indicate DNA.


Prelude: The λ Story, What Came Before

Note that in this section we rely on the reviews listed above 
for many of the references. In some cases, where we feel it is 
warranted, further appropriate references will be included. 

Transcription Patterns 

This discussion is based on the information contained in
figure 9-1. Similar transcription patterns have been
observed with all lambdoid phages that have been examined.
However, the small differences between λ and other
members of the lambdoid family have provided important
insights into the mechanism of regulatory processes that
act during transcription elongation. These processes, transcription
termination and antitermination, play important
roles in the temporal regulation of transcription of the λ
genome (figure 9-1). Transcription terminators, sequences
that direct dissociation of the ternary RNA Pol-DNA-mRNA
transcription complex, are found at strategic positions
on the genomes of lambdoid phages (indicated by t in
figure 9-1). As in E. coli, these terminators can be either
Rho-dependent or intrinsic (factor-independent). The N
protein of λ, with E. coli Nus factors, modifies the transcribing
RNA Pol at NUT sites, creating a termination-resistant
transcription complex. The NUT sites, recognized
on the RNA transcripts, are located approximately 35 and
approximately 225 bases downstream, respectively, of the
transcription start sites of the early λ p L and p R promoters.
This N-directed antitermination allows full expression of
genes downstream of transcription terminators in the p L
and p R operons. The Q gene, encoding another antitermination
factor, is 5.8 kb beyond the p R promoter. Because there
are at least four transcription terminators between the p R
promoter and the Q gene, N antitermination is required
for Q expression. The Q antitermination protein acting at
the late p R ' promoter, in turn, is required for full expression
of the phage late genes. As we will see, the requirements and
mechanics of N and Q modification of the polymerase
complex are quite different.

The N Antitermination Protein 

The N gene (15), located immediately downstream of the
early p L promoter, encodes a 12.2 kDa protein (46, 72) that,
as discussed, acts at NUT sites to modify RNA Pol to a
termination-resistant form. To a first approximation N proteins
have specificity for their cognate NUT sequence, although
when overexpressed in vivo, N proteins can act at noncognate
NUT sites (47, 150).

Genetic studies suggested an interaction between N and
RNA Pol. E. coli carrying a mutation (ron) in the rpoB gene
(encoding the β subunit of RNA Pol) fails to support growth
of λ with certain mutations, mar, in the N gene (59). Moreover,
an E. coli carrying another rpoB mutation, rpoB3595,
supports growth of a λ expressing a defective N (100), a
finding consistent with the idea that an interaction with
N normally modifies RNA Pol into a termination-resistant
form (86).

Host Factors 

Mutations in nusA, nusB, rpsJ (nusE), and rho (nusD) that
result in a failure of the host E. coli to support N-mediated
antitermination identified genes whose products were
likely to be involved in N action (54). A direct role for the
Nus factors in N action was demonstrated in two ways.
First, this was shown using an in vitro system with S30
extracts derived from nus mutants and controls with
plasmids expressing the wild-type allele (34). Extracts from
E. coli derivatives with nus mutations failed to support
N antitermination, while extracts from the same mutant
bacteria with plasmids expressing the wild-type allele
supported N antitermination. Second, a transcription elongation
complex containing N and the Nus factors was
identified and characterized using an in vitro system that
requires S100 extracts (63, 81). The inability to develop an
in vitro system with purified components that showed
processive N-mediated antitermination indicated that all of
the required factors had not been identified. However,
N, NusA, and RNA Pol with a template that included a nut
site were shown to be sufficient to form a minimal transcription
complex that can read through a single downstream
terminator (179). This led to the idea that NusA and N are
the core components of the complex.

These host factors (collectively referred to as Nus) subverted
by the phage for N antitermination play important
roles for the host in transcription and translation. Subsequent
to its identification as a component of the N-modification
complex, the nusA gene product was shown to influence
transcription pausing and termination in E. coli (133). The
isolation of thermal-sensitive (118) and cold-sensitive (26)
alleles of nusA identified NusA as an essential protein in
E. coli.

The nusB gene product was shown to influence both
transcription termination and antitermination (49, 57,
91, 172). The isolation of nusB null mutants that were
cold-sensitive defined NusB as being essential at low
temperature (166). The rpsJ (nusE) gene (56, 173), encoding
ribosomal protein S10, obviously plays a role in translation.
Evidence for involvement of the rho gene product derived
from experiments with an E. coli derivative with the rho026
allele. This allele was referred to as nusD, because E. coli
carrying rho026 as its only rho allele exhibits a slightly
weakened ability to support N action (32).

Second-Site Suppressors 

Mutations in E. coli and λ have been obtained that suppress
the effect of nus mutations on N action. Certain second-site
mutations in the E. coli nusB gene as well as mutations
in the λ N gene and nut site enhance the action of N in
E. coli carrying either the nusA1 or nusE71 mutations (53,
150, 171).

The NUT Site (see figure 9-1) 

Analysis of a cis-dominant mutation that blocks the action
of N in the p L operon led Salstrom and Szybalski (148)
to identify the nut site in the p L operon. Based on this 
observation, the nut site in the p R operon was identified 
(146). Identification of a frameshift mutation in cro that 
allows translation past the normal translation stop in the 
cro gene and thereby blocks N action in the p R operon 
provided the first evidence that N action, at least in part, 
works through the NUT RNA (126). 

The NUT site has three elements: BOXA and BOXB, a 
stem-loop structure, and between them a spacer region 
(figure 9-2) (52). As will be discussed in detail below, the N
and Nus proteins assemble on the NUT site of the nascent
RNA and modify RNA Pol to a termination-resistant form.
Transcription initiating at p R is modified by N as it passes
through the nutR site, causing RNA Pol to transcend the
tR1 and nin terminators. This allows transcription of
the downstream O and P replication genes, the nin region,
and the Q gene. Similarly, transcription from p L is modified
by N as it passes through the nutL site, causing RNA Pol
to transcend downstream terminators.

Q and qut 

The Q gene encodes a 20 kDa protein that, like N, modifies
RNA Pol to a termination-resistant form that is responsible
for transcription of most of the λ genome (78). Unlike N,
which recognizes RNA sequences, Q recognizes a DNA
sequence called qut (159, 160, 185). A more complete
discussion of Q can be found below.

Nun 

Phage HK022 differs from other members of the lambdoid
family in not having an N-like function. Located at the
position of the N gene, HK022 has the nun gene, encoding
a protein that acts not with the HK022 RNA but with λ
RNA! It was observed that λ fails to grow in an HK022
lysogen even though HK022 grows in a λ lysogen (136).
This led to the idea that there is an immunity-independent
(immunity describes the operator-repressor interaction)
activity in HK022 that blocks λ growth in the HK022
lysogen. This exclusion is specific to phage with the λ
NUT site; other lambdoid phages having N-NUT systems
that are heterologous to the N-NUT of λ grow in an
HK022 lysogen. In contrast to N, which uses NUT sites
for antitermination, Nun uses λ NUT sites to terminate
transcription in regions of the λ genome that do not
contain terminators. Moreover, unlike either intrinsic or
Rho-dependent termination, Nun-mediated termination
requires the same Nus proteins used in N-mediated
antitermination. Thus Nun acts as a highly specific exclusion
function, inhibiting, as far as we know, only those lambdoid
phages with the nut sites of λ.


The New Story 

The N Antitermination Protein 

Although the N antitermination complex is composed of a 
number of proteins and an RNA site, in vitro experiments 
provide evidence that N protein itself is the prime mover in 
establishing the termination-resistant transcription 
complex (37, 132). One can eliminate Nus factors (37), and 
even the NUT site, and efficient antitermination is still 
maintained (76, 132). However, for such factor- and
site-independent action, N must be supplied at very high
concentrations and the salt concentration must be low.
Under these special conditions the modified polymerase is
capable of antitermination of one terminator at a limited
distance, but not of the processive antitermination through
multiple terminators that is observed only with the full complex.
Binding studies show that the affinity of N for the NUT 
site is approximately 1000-fold higher than it is for an 
RNA without a NUT site (168). Thus, Nus factors and a 
NUT site on the RNA are significant contributors under 
physiological conditions to the efficiency and processivity 
of N-mediated transcription antitermination. 
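
To give a sense of scale for a 1000-fold difference in affinity, the standard thermodynamic relation ΔΔG = RT ln(ratio) can be evaluated directly. The short sketch below does this; the function name is hypothetical, and the temperature of 37 °C (310.15 K) is an assumption, since the conditions of the binding studies are not stated here.

```python
import math

# Gas constant in kcal per mole per kelvin.
GAS_CONSTANT_KCAL = 1.987204e-3


def binding_energy_gap(fold_difference, temperature_k=310.15):
    """Free-energy difference (kcal/mol) implied by a ratio of affinities.

    fold_difference : ratio of the two binding constants (e.g., 1000)
    temperature_k   : absolute temperature; 310.15 K is an assumed value
    """
    return GAS_CONSTANT_KCAL * temperature_k * math.log(fold_difference)


if __name__ == "__main__":
    gap = binding_energy_gap(1000.0)
    print(f"1000-fold affinity difference ~ {gap:.2f} kcal/mol at 310.15 K")
```

Under that assumed temperature, a 1000-fold preference corresponds to roughly 4 kcal/mol of additional binding free energy contributed by the NUT site.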

N and its homologues are, to a first approximation,
specific for their cognate NUT sites (47, 150). Functional
regions of the N protein were first identified by Lazinski
et al. (99) using hybrids formed between the N gene of λ
and N homologs from other lambdoid phages. Their study
showed that the NUT site specificity was located in the
amino portion of the N protein, an 11 amino acid region
rich in Arg residues. Lazinski et al. (99) pointed out that
a number of RNA binding proteins had similar Arg-rich
motifs or ARMs. The importance of the Arg residues was
confirmed by a mutational analysis in which 14 different
amino acids in the ARM region of N, including five Arg
residues, were systematically replaced with different
amino acids (45). Single changes at each of the five Arg
residues interfered with N action, while similar changes
at six other residues did not affect N action. The nature
of the interaction between N and the NUT site is
discussed below.

Figure 9-2 Features of antitermination sites in early operons of lambdoid phages. A: Comparison of nut regions (L and R) of λ
and H-19B with component elements identified below (52, 120). The arrows indicate regions of hyphenated dyad symmetry
that are sources of stem-loop structures. B: The PUT stem-loop structures of phage HK022 (6). C: The GNRA fold-like
structure formed by λ BOXB upon interaction with N, shown here for the NUTR-BOXB. Bases, sugars, and phosphates are
represented respectively by rectangles (with base indicated within), ovals, and squares. Dashed lines show hydrogen
bonds and the —- line shows the sheared G-6 A-10 base pair. Bases are numbered from the 3' end of the upstream stem to the
5' end of the downstream stem. The purine ring of adenine 7 stacks with the indole ring of Trp18 of N, and guanine
9 is extruded from the structure. See thebacteriophages.org/frames_0090.htm for a color version of this figure.

Experiments from the Greenblatt laboratory defined 
functional roles of various regions of the 107 amino acid 
residue N protein (114). Affinity chromatography using a 
set of GST fusions to various fragments of N identified 
amino acids 34-47 of N as the NusA binding site. A series of 
experiments that included affinity chromatography, band 
shifts, and in vitro antitermination assays provided evidence 
that both regions near the amino (within residues 1-47) 
and carboxy (including residues past residue 73) termini of 
N are involved in binding RNA Pol. Despite having 
apparent functional domains, N appears unstructured in 
solution, but does become structured when it binds to 
NUT RNA (114, 167). 


Host Factors 

(Note: Although the Nus factors have important physiological 
roles for the host bacterium, we will use the term Nus⁻ to 
describe the phenotype of a mutant nus allele that is defective 
in support of N antitermination activity.) 

NusA 

Genomic, structural, genetic, and functional studies collectively 
provide an integrated view of NusA action. Orthologs 
of NusA have been identified throughout the bacterial kingdom. 
Analysis of these sequences led to identification of 
three regions that have similarities with known RNA-binding 
motifs: one S1 (13) and two KH (60). In vivo studies 
coupled with in vitro studies characterizing nusA point 
mutations showed that the S1 and KH domains are each 
involved in RNA binding. Moreover, these mutations 
influenced, to a greater or lesser degree, N-mediated 
antitermination and bacterial viability (115, 191, 192). These in 
vivo studies suggested an order of assembly of the N-Nus 
complex that is the same as that derived from in vitro studies 
(113, 179). The assembly begins with N and NusA binding 
to the NUT site in the newly synthesized NUT RNA, followed 
by the addition of the other Nus proteins (NusB, ribosomal 
protein S10, and NusG). 

A set of truncated NusA proteins was used in an in vitro 
study that associated particular regions of NusA with 
specific functional activities (109): (i) a NusA segment 
must include the S1 and two KH domains to support 
N-mediated transcription antitermination; (ii) both the N- and 
C-terminal regions of NusA are important for RNA Pol 
binding; (iii) the amino region (amino acids 1-348) is sufficient 
for enhancing termination at an intrinsic terminator 
to levels observed with the complete NusA protein (495 
amino acids). 

The crystal structures of the NusA proteins from 
Thermotoga maritima (0.21 nm resolution) (183) and 
Mycobacterium tuberculosis (0.17 nm resolution) (65) have been 
determined. Even though there are some differences, these 
structures support the idea of a continuous RNA binding 
structure that includes the S1 and two KH domains as 
well as the amino portion of NusA. 

NusB 

The amino acid changes caused by the commonly used nusB5 
mutation (which confers a Nus⁻ phenotype) and by the 
suppressor mutation nusB101 (which suppresses the effects of 
the nusA1 and nusE71 mutations on N action) have been 
identified: a Tyr to Gly change at position 18 (NusB5) and an 
Asp to Asn change at position 118 (NusB101) (25). First 
identified by its role in N-mediated antitermination, NusB has 
more recently been characterized for its role in expression 
of rrn operons (154, 161). 
NusB, in cooperation with NusE, was shown to bind to an 
RNA containing the consensus BOXA sequence (108, 122), 
and the NusB and NusE proteins interact even in the 
absence of RNA (66, 112). The solution structure of NusB 
as determined by nuclear magnetic resonance was shown 
to consist of two subdomains, each having an all α-helical 
fold (3, 64). 


Ribosomal Protein S10 (NusE) 

The sequence change of the only identified mutation in 
rpsJ (which encodes ribosomal protein S10) that affects 
N-mediated antitermination, nusE71, has been identified as an 
Ala to Asp change at codon 86 (25). Structural studies on the S10 
orthologue from Mycobacterium tuberculosis indicate little 
secondary structure (66). Both the NusB and S10 proteins of 
M. tuberculosis (66) and E. coli (112) have been reported to 
form a complex. This could indicate how the two proteins 
interact in the N antitermination complex. However, there 
is likely to be another role for S10, since even under conditions 
where NusB is not required, S10 (NusE) is required for 
N antitermination (129). 

NusG 

The original selection for mutations with a Nus⁻ phenotype 
did not yield mutations in nusG. However, Gottesman 
and coworkers isolated a mutation in nusG, originally 
called U, that, like nusB101, suppresses the Nus⁻ phenotype 
of the nusA1 and nusE71 mutations (164). The first 
evidence indicating that NusG is required for N action 
came from in vitro studies of Greenblatt and coworkers 
(103, 104). Although a requirement for NusA, NusB, and 
S10 in N-mediated antitermination could be demonstrated 
in vitro, efficient N-mediated antitermination was only 
observed when the in vitro reaction mixture also included 
an S100 or S30 extract (34, 63). This suggested that one or 
more factors in addition to the identified Nus factors were 
required for N action. Greenblatt and collaborators identified 
the missing component as NusG by showing that purified 
NusG, together with the previously identified Nus factors, 
eliminated the requirement for the S100 (103). More 
recently, it has been shown that NusG is also required for 
N action in vivo (190). Because NusG is essential, bacteria 
having only a single inactivated nusG allele could not be 
used to assess the role of NusG in N action. To assess 
physiological activity in the absence of NusG, Gottesman 
and collaborators developed a system for creating bacteria 
depleted of NusG (163). Bacteria with the chromosomal 
nusG allele disrupted by insertion of a kan cassette 
(nusG::kan) and an intact nusG gene carried by a plasmid 
temperature-sensitive for replication are grown for many 
generations at an elevated temperature nonpermissive for 
replication of the plasmid. This segregates nusG::kan derivatives 
free of the plasmid and, after a number of rounds of growth 
in the absence of the plasmid, derivatives free of NusG are 
obtained. E. coli depleted for NusG in this way failed to 
support growth of λ (190). However, λ derivatives that 
either carried their own nusG gene or were deleted 
for terminators, obviating the need for N antitermination, 
produced phage bursts 30 times greater than did λ infecting 
E. coli depleted for NusG. Thus, the reduced burst of λ 
in the NusG-depleted bacterium is due to the lack of NusG 
and a resulting failure in N-mediated antitermination. 
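
The depletion step can be illustrated with a simple dilution calculation. The sketch below is not taken from the cited work: it assumes that no NusG is synthesized once the plasmid is lost, that the protein is stable, and that it is partitioned equally at each division, so the amount per cell halves with every generation. The function name and the copy numbers are hypothetical.

```python
import math


def generations_to_deplete(initial_copies, threshold_copies):
    """Generations of growth needed for a stable, no-longer-synthesized
    protein to fall to or below a threshold number of copies per cell.

    Assumption (not from the source): the per-cell amount halves at each
    division, i.e. simple two-fold dilution with no degradation.
    """
    if initial_copies <= 0 or threshold_copies <= 0:
        raise ValueError("copy numbers must be positive")
    if initial_copies <= threshold_copies:
        return 0
    # Solve initial / 2**n <= threshold for the smallest integer n.
    return math.ceil(math.log2(initial_copies / threshold_copies))


# Hypothetical starting level of 1000 copies per cell, diluted to <= 1 copy.
print(generations_to_deplete(1000, 1))
```

Under these assumptions only on the order of ten generations are needed after plasmid loss, consistent with the statement that depleted derivatives are obtained after a number of rounds of growth.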

In addition to its role in antitermination, NusG facilitates 
action of termination protein Rho; significant readthrough 
of some but not all Rho-dependent terminators has been 
observed in bacteria depleted for NusG (163). A slight 
decrease in termination at some intrinsic terminators has 
been reported. However, the main effect of NusG is on 
Rho-dependent termination (104). Through its direct binding to 
Rho, NusG was proposed to act as a bridge between Rho 
and RNA Pol (104, 121). Release of Rho from stalled 
transcription complexes is slowed by NusG, which only stably 
binds to the elongation complex in the presence of Rho (121). 

The product of the rho026 allele and the nature of its 
interaction with NusG are of interest because, as discussed 
before, rho026 confers a weak Nus⁻ phenotype (32). 
Rho026 differs from the wild-type Rho product by not 
interacting with NusG at high temperature (12). Although the 
rho026 mutant shows reduced ability to support N action, it 
does not exhibit a significant reduction in Rho-dependent 
termination at any temperature (32). Consistent with these 
observations, the termination activity of Rho026 is less 
dependent on NusG than is that of wild-type Rho (12, 104). 
Overexpression of NusG reduces wild-type Rho activity, presumably 
by competing with NusG-Rho complexes for RNA Pol binding 
(12). NusG enhancement of Rho-dependent termination 
appears to be specific to the E. coli RNA Pol. Although 
Rho-dependent termination can be observed with T7 and Sp6 
RNA Pol, the termination is not enhanced by NusG (128). 

The NUT Site 

Information gained from genetic and structural studies as 
well as genome comparisons has extended our understanding 
of the nature and function of interactions at NUT RNA. 
The nut sites have been divided into three components: boxA, 
a spacer, and boxB (figure 9-2A). Although the interactions 
at NUT involve a complex formed between all of the components, 
the following discussion will focus on each of the 
component sites individually. 

BoxA 

A comparison of boxA sites yielded a consensus sequence, 
referred to as boxAcon (CGCTCTTTA) (55). The nut sites of 
most lambdoid phages, with the exception of Salmonella 
phage P22 which has the consensus sequence, have slight 
variations from the consensus sequence (e.g., that of λ is 
CGCTCTTac). Phages with nut sites having the consensus 
boxA sequence require a less effective N-Nus complex than 
do phages having nut sites with variations of the consensus 
boxA sequence (55). The consensus boxA suppresses the 
effect of boxB mutants that cause a greater than 1000-fold 
reduction in affinity for N (29). Because the consensus 
BOXA binds NusB (and S10) more avidly than does the λ 
wild-type BOXA (108, 122), it is likely that more secure 
binding of NusB stabilizes weak binding of NusA with N in 
the absence of BOXB. 

BOXA was studied using a reporter construct whose 
relevant features include a galK reporter gene fused to the 
lac promoter with an intervening cassette containing the 
λ nutR and the downstream tR1 Rho-dependent terminator 
(129). N was provided in trans and galK expression provided 
a measure of the effectiveness of antitermination. Although 
the boxA16 (CGCTaTTAC) (145) and boxA5 (CtCTCTTAC) 
(127) point mutations reduced N-mediated antitermination 
35- and 7-fold respectively, a deletion that included boxA 
and upstream sequences was not defective for N-mediated 
antitermination. Surprisingly, the deletion suppressed the 
defect in N action normally imposed by the nusB5 mutation, 
but not the block imposed by mutations in other nus genes. 
Thus, NusB activity is required when boxA is present and 
dispensable when boxA is deleted. Experiments with the 
nusB5 mutant provided further insight into this apparently 
paradoxical finding. 
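
The relationships among the boxA sequences quoted above can be checked with a short position-by-position comparison. This is only an illustrative sketch: the sequences are those given in the text (with the lowercase letters that mark changes written in uppercase), and the function and variable names are hypothetical.

```python
# boxA sequences as quoted in the text (DNA sense strand, 5' to 3').
BOXA_CONSENSUS = "CGCTCTTTA"   # boxAcon
BOXA_LAMBDA = "CGCTCTTAC"      # wild-type lambda boxA
BOXA_16 = "CGCTATTAC"          # boxA16 point mutant
BOXA_5 = "CTCTCTTAC"           # boxA5 point mutant


def mismatches(seq, ref):
    """List (position, reference base, observed base) for each difference.

    Positions are 1-based. Both sequences must have the same length.
    """
    if len(seq) != len(ref):
        raise ValueError("sequences must be the same length")
    return [(i + 1, r, s) for i, (r, s) in enumerate(zip(ref, seq)) if r != s]


# Lambda differs from the consensus at the last two positions.
print(mismatches(BOXA_LAMBDA, BOXA_CONSENSUS))
# Each point mutant differs from wild-type lambda at a single position.
print(mismatches(BOXA_16, BOXA_LAMBDA))
print(mismatches(BOXA_5, BOXA_LAMBDA))
```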

The nusB5 host, which expresses low levels of an active 
NusB (25, 112), does not support λ growth due to a failure 
in N action. However, when NUT RNA with the BOXA5 
change is overexpressed from a plasmid in the nusB5 host, 
λ growth and, presumably, N action, are supported. Further, 
when λ NUT RNA with the wild-type BOXA is similarly 
overexpressed, λ growth is not supported. 

These results are consistent with a model in which there 
is an inhibitory factor that not only binds BOXA RNA but 
also binds more avidly to BOXA5 RNA. Presumably, it 
is this preference of the inhibitor for BOXA5 that causes 
the boxA5 mutation to reduce the effectiveness of N action 
in the host with wild-type nus alleles. 

The Inhibitor 

In brief then, the BOXA site within the NUT RNA recognizes 
an inhibitor that interferes with the formation of an 
effective N antitermination complex. It was further proposed 
that NusB, by binding to BOXA, hinders access of the 
inhibitor to the NUT site and thereby prevents the action of 
the inhibitor (129). Results of an in vitro experiment 
showing that “substantive” levels of NusB were not 
necessary for N-mediated antitermination supported this 
model. It was argued that NusB is not necessary in vitro 
because the free inhibitor is not present (37). 

There is no direct evidence as to the nature of the 
putative inhibitor. However, there is suggestive evidence 
that the C-terminal domain of the α subunit (the αCTD) of 
RNA Pol plays a role in the action of the inhibitor. A point 
mutation in rpoA has been identified that interferes with 
N action (125), while two other point mutations (D305E 
and L280H) that alter the αCTD as well as a deletion 
(rpoAΔ3') that encodes an α subunit missing the CTD were 
shown to enhance N antitermination (151). The fact that 
a CTD-truncated α enhances N antitermination suggests 
that, if left unchecked, the αCTD can negatively impact 
N action; i.e., it acts to inhibit N-mediated antitermination 
(50). In vitro studies by Hanna and colleagues on interactions 
of α during transcription elongation are consistent 
with these ideas. The αCTD was shown to interact with the 
nascent RNA during transcription, an interaction that is 
not observed in the presence of NusA (105). In vitro, as in 
vivo, N antitermination is effective even when the α subunit 
is missing the CTD (106). Moreover, it is unlikely that the 
αCTD itself is the inhibitor. In vitro studies using purified 
components to study N antitermination have RNA Pol with 
the complete α subunit. If the αCTD is the inhibitor, then 
NusB should be required in these in vitro reactions to block 
the inhibitory activity of the αCTD. However, NusB has little 
effect on N action in this type of in vitro antitermination 
reaction, unlike the other Nus factors that are required (37). 
Further, it is unlikely that free α is the elusive inhibitor since 
the mutant αCTDs enhance N activity in merodiploids in 
which a wild-type α is also expressed (151). What we do 
suggest is that the inhibitor, in some way, works through 
the αCTD and NusB supports N antitermination by blocking 
action of the competitor that acts through the αCTD. 

BoxB 

The specificity of the different N proteins for their cognate 
NUT site indicated that sites of N recognition are most likely 
located in the regions of the nut sequences showing significant 
sequence heterogeneity. Comparison of nut sites from 
different lambdoid phages identified sequences composing 
the BOXB stem-loop structures (44, 99) as having the 
heterogeneity that would allow for binding specificity. The first 
mutation in a nut sequence that eliminated N action was 
located in the loop of the nutL boxB (148). DNA 
oligonucleotides complementary to BOXB RNA severely inhibited 
N-mediated antitermination, while DNA oligonucleotides 
complementary to sequences immediately upstream and 
downstream of the NUT site had little effect (180). 

In an elegant set of experiments, the Das group (99) used 
hybrid nut sites and chimeric N proteins constructed from 
different lambdoid phages to conclusively demonstrate that 
BOXB is the site that provides specificity for N and that the 
amino portion of the N protein recognizes the NUT RNA 
sequence. The following provides examples of the experiments 
that led to these conclusions. An antiterminator 
tester plasmid carrying a hybrid nut site with the boxB from 
phage 21 and the boxA and spacer from λ was far more 
effective in supporting antitermination when supplied with 
the N of phage 21 than when supplied with the N of λ. 
Further, a hybrid N protein with the first 35 amino acids 
from phage 21 and the C-terminal 74 amino acids from λ is 
active with NUT21 but not NUTλ. Based on the observation 
that the amino proximal regions of all N proteins share a 
motif rich in Arg residues, Lazinski et al. (99) proposed that 
this motif is involved in binding of N to RNA. Further, they 
suggested that ARMs are important in binding of other 
proteins, such as the human immunodeficiency virus TAT 
protein, to RNA. Subsequent studies bore this prediction out 
(165). The ARM of N is functionally important since, if the 
Arg residues are changed, N action is impaired (45). 

Nuclease protection, mutational analysis, and band shifts 
provided important information as to the nature of N and 
NusA binding with the BOXB stem-loop (18, 21, 38, 115, 167). 
Both the stem-loop structure and nucleotides in the ascending 
arm and the loop are essential for N binding to BOXB 
RNA. Not all mutations in the loop that reduce effectiveness 
of the NUT sequence affected N binding (18). 

Structural studies have addressed the question of the 
nature of N interaction with BOXB (101, 114, 149, 162, 167) 
and the following briefly summarizes some of the important 
information gained from those studies. NMR studies with 
peptides containing the ARM of λ N have probed the nature 
of binding between BOXB and N. These studies revealed that 
the ARM domain of N assumes a bent α-helical structure and 
the BOXB loop assumes a structure resembling a GNRA fold, 
a tetranucleotide structure that stabilizes the RNA hairpin 
(4, 79). Consistent with such a structure, the bound BOXB 
loop includes a sheared GA pair (20), a side-by-side pairing 
of G and A that is seen in GNRA folds. Genetic and structural 
studies provide evidence that the BOXB GNRA-like 
fold is necessary for N binding. Further, the purine ring of 
adenine 7 in the loop stacks with the indole ring of Trp18 
of N, and guanine 9 in NUTL and, by inference, adenine 
9 in NUTR are extruded from the structure (see figure 9-2C 
for numbering of bases in the BOXB stem-loop). The BOXB 
structure differs from the GNRA tetraloop because proper 
folding of BOXB requires peptide binding with stacking 
between the indole ring of Trp18 of N and base A-7 of BOXB. 
Although a change at position 9 does not affect N binding, 
it affects biological activity (38) and can eliminate NusA 
binding (115). Thus, the structure of BOXB appears to be 
important for binding both N and NusA. Studies with P22 
N (80, 102) show that the P22 BOXB with bound P22 
N (also called 24) adopts a structure resembling a GNRA fold 
and the ARM of the bound N also assumes a bent α-helical 
structure (14). However, in this case a different base is 
extruded from the loop sequence. The difference in the 
extruded base, which in the case of λ N appears to interact 
with NusA, could explain the differences in requirements 
for NusA exhibited by the N proteins of λ and P22 (27, 49). 


Cooperative Interactions at NUT 

We will now examine more carefully the integration of the 
individual interactions that result in the generation of the 
full complex. A stable N antitermination complex forms 
only after transcription passes through the nut site. This 
complex contains N and the Nus factors (8, 31, 81). 
Mobility shifts were used to assess binding of the other 
E. coli factors to a core complex consisting of NusA, N, 
and RNA Pol, with λ NUT RNA (116). When the RNA had a 
wild-type λ BOXA sequence, binding of NusB, S10, and NusG 
was found to be cooperative: only when all three proteins 
were present was the core complex effectively shifted. Most 
changes in BOXA away from consensus reduced this binding 
to the core complex. However, when the NUT site had 
BOXACON (the consensus BOXA sequence), NusB and S10 
were sufficient to shift the core complex. 

These results appear to contradict previously discussed 
in vivo experiments showing that, when BOXA and some 
upstream sequences were deleted, S10 was required for 
N action even though NusB was not (129). However, the 
interaction of S10 in the absence of NusB and/or a proper 
BOXA sequence with the core complex may not be observable 
by gel shift assay. 

Employing a technique called an immunoprint, anti-N 
antibody was used to precipitate transcription complexes 
with labeled nascent RNA (8, 31). The in vitro transcription 
reaction included N protein, an S30 extract, and a 
DNA template with a nut site. Transcription was stopped 
throughout the length of the template using chain-terminating 
nucleotides. The RNAs isolated from the precipitated 
complexes were separated by gel electrophoresis and 
the results could be used to determine when N became 
associated with the transcription complex. Controls 
showed that precipitable complexes were not formed if a 
purified NUT-containing RNA was mixed with RNA Pol 
and the other components of the reaction: i.e., the anti-N 
antiserum precipitated only components in an elongation 
complex. 

One set of experiments using the immunoprint technology 
showed that N association with the transcription 
complex is observed beginning with those transcripts that 
have reached the distal end of the nut site (the 3' end of 
boxB) and those that have progressed beyond that point (8). 
In these experiments, N and the S30 extract were present 
during the entire reaction. This result is consistent with the 
NUT RNA directing acquisition of the N-Nus complex. 

A second type of experiment using the immunoprint 
technology studied acquisition of N by paused transcription 
elongation complexes preformed in the absence of N and 
S30 extract (8). N and S30 extract were added to paused 
RNA polymerases that had been transcribing the template 
containing a nut site. Only one stable immune complex was 
precipitated with anti-N antibody: that transcription 
complex had proceeded just four nucleotides beyond the 3' 
end of the BOXB stem-loop. Moreover, stable complexes that 
are precipitable with immune serum did not form with 
transcribing RNA Pol that had passed the nut site. It was 
only during the limited time when the BOXB structure was 
still forming that N could be acquired. This provides 
strong evidence that the antitermination complex with 
N assembles only when the RNA Pol is at the nut site. 


Under completely different experimental conditions using 
a purified transcription system, it was shown that RNA Pol 
can be modified to antiterminate even when NusA and N are 
supplied after transcription has advanced a significant 
distance past the nut site (180). That is, the NUT RNA can 
apparently bind N and NusA and loop over to attach to 
the distally located RNA Pol. Although the latter results 
are intriguing, we find the data with S30 extracts more 
compelling because the conditions of these experiments are 
more representative of what happens in vivo. 

The results of the immunoprint experiments with the S30 
extracts suggest that the initial interaction of N with NUT 
RNA may be very different from the one proposed by studies 
of N binding to purified NUT RNA. We interpret these S30 
results as showing that N binds to the sequences of the 
ascending stem and loop of BOXB that have advanced out 
of the RNA-DNA hybrid in the absence of formation of the 
stem-loop structure. When transcription reaches four bases 
beyond BOXB and NUT, the RNA Pol-DNA-RNA ternary 
complex is presumably in its native transcription elongation 
state. In this state, the 3' end of the RNA is within 
the active center and the upstream eight nucleotides are 
in a hybrid with the template DNA (96, 155, 184). An additional 
six nucleotides upstream of the RNA/DNA hybrid are 
protected from nuclease as they pass through the RNA Pol 
to the exterior (95). At this point, the upstream part of 
NUT, which includes BOXA, will be completely extruded 
outside of the RNA Pol and thus should be able to bind 
NusB and S10 (NusE). The BOXB stem, however, cannot yet 
be formed, as most of the downstream side of the stem will 
still be trapped in the RNA/DNA hybrid. How can N bind 
to the complex if, as suggested by a number of biochemical 
and structural studies with pure components, N binding 
requires that the BOXB stem with its tetraloop must form? 
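
The geometry described in this paragraph can be made concrete by counting nucleotides back from the 3' end of the transcript. The sketch below uses the numbers stated in the text (a complex stopped four nucleotides past the 3' end of BOXB, an eight-nucleotide RNA/DNA hybrid, and six further protected nucleotides). The lengths of the BOXB stem arms and loop (five nucleotides each) are an assumption introduced here for illustration, and all names are hypothetical.

```python
def locate_boxb(nt_past_boxb=4, hybrid_len=8, protected_len=6,
                stem_arm_len=5, loop_len=5):
    """Classify BOXB nucleotides by their location in a stopped complex.

    Positions are counted from the RNA 3' end (position 1). Positions
    1..hybrid_len are in the RNA/DNA hybrid, the next protected_len
    positions are inside RNA Pol, and anything further upstream is exposed.
    stem_arm_len and loop_len are illustrative assumptions.
    """
    # Walk upstream (3' to 5'): descending stem, then loop, then ascending stem.
    segments = [
        ("descending stem", stem_arm_len),
        ("loop", loop_len),
        ("ascending stem", stem_arm_len),
    ]
    position = nt_past_boxb  # nucleotides already made downstream of BOXB
    result = {}
    for name, length in segments:
        counts = {"hybrid": 0, "protected": 0, "exposed": 0}
        for _ in range(length):
            position += 1
            if position <= hybrid_len:
                counts["hybrid"] += 1
            elif position <= hybrid_len + protected_len:
                counts["protected"] += 1
            else:
                counts["exposed"] += 1
        result[name] = counts
    return result


for segment, counts in locate_boxb().items():
    print(segment, counts)
```

With these assumed lengths, four of the five descending-stem nucleotides are still in the hybrid, the loop lies within the protected region, and only the ascending stem has emerged, which is the situation that prompts the question posed above.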

Because BOXB is within the RNA Pol complex at this +4 
position, it is possible that this association alters the RNA so 
that it binds N even though the BOXB stem-loop has not 
formed. According to this scenario, formation of the BOXB 
stem-loop would be unnecessary for the initial binding 
of N. As the RNA is extended, the stem sequences would 
be released from their tight association with RNA Pol 
and the BOXB stem-loop could form. Retention of N, and 
thus maintenance of the antitermination complex, would 
then require that N bind to this newly formed structure. 
Processive antitermination would proceed with N binding 
to the BOXB stem-loop and association of the other Nus 
factors through binding at BOXA and by protein-protein 
interactions. 

To explain the rapid assembly of the antitermination 
complex at the NUT site, we consider N action in light of 
an idea originally proposed by Das and collaborators (33). 
As they suggested, N and NusA are bound in a loose association 
with RNA Pol prior to reaching the nut site. When 
NUT RNA is synthesized, N can rapidly slide into position 
on the BOXB sequences comprising the leading arm of 
the BOXB stem and the loop region. This can occur prior to 
the formation of the BOXB structure, while sequences 
composing the descending stem are still in the RNA-DNA 
hybrid. As discussed above, this is apparently the one position 
during transcription when NUT RNA can bind N and 
Nus factors to initiate formation of a stable 
RNA-N-Nus-RNA Pol complex. Hence, the role of BOXB may not be to 
increase the local concentration of N to facilitate binding 
to RNA Pol but to allow N to form a stable complex. RNA 
Pol with NUT RNA and N and Nus proteins then becomes 
a processive transcription complex (34, 113). 

N Action 

How does N-modified RNA Pol become 
antitermination-proficient for both Rho-dependent and intrinsic terminators? 
At intrinsic terminators N could shift from the NUT 
stem-loop temporarily to one arm of the terminator to transiently 
block formation of the terminator stem-loop (76). We 
suggest that at Rho terminators, the properly positioned 
NusA could block access of Rho to the DNA-RNA hybrid. 
This would prevent Rho from destabilizing the hybrid. 
For alternatives on the possible mechanism of N, the reader 
is referred to other reviews (33, 73, 133). These models 
primarily address the question of antitermination at intrinsic 
terminators. Therefore, we will look in more detail at 
N-mediated antitermination at Rho-dependent terminators. 

We view NusA as likely being the critical element that 
prevents Rho-dependent termination. In addition to its 
action in N and Q antitermination systems, NusA plays a 
role in antitermination in the rrn operons (169). All these 
antitermination systems support readthrough of 
Rho-dependent termination sites. Importantly, NusA can have a 
direct effect on Rho termination. In a pure transcription 
system with only DNA, RNA Pol, and Rho present, termination 
is reduced by NusA at Rho-dependent terminators 
(11, 98, 156). 

One key observation for understanding the role of NusA 
in promoting readthrough is that in an in vitro transcription 
system NusA prevents termination at the first (site I) of 
three subsites within the Rho-dependent terminator tR1. 
However, NusA does not prevent Rho-dependent termination 
at the downstream sites II and III in tR1 (98). Above we 
argue that NusA occupies a weak binding position on RNA 
Pol, and it is this binding that we propose interferes with the 
initial interaction of Rho with RNA Pol. 

We suggest that Rho displaces NusA from RNA Pol at the 
first subsite but fails to terminate transcription. However, 
Rho is now in a position to terminate transcription at all 
of the downstream sites. Studies with the nusA11(Ts) mutation 
(118) provide an insight into how NusA and Rho may 
interact. This mutation causes a Rho-defective phenotype 
with characteristics similar to the rhots15 mutation (118; 
D. L. Court, unpublished data). NusA11 appears then to 
act as an antagonist of Rho activity, possibly by binding 
more tightly to RNA Pol than does wild-type NusA, and 
thereby becoming more resistant to Rho-imposed dissociation 
from RNA Pol. Identification of a second-site mutation 
in rpoC (which encodes the β' subunit of RNA Pol) that suppresses 
the Rho-defective and temperature-sensitive phenotypes 
of nusA11 is consistent with this idea (85). The altered β' 
subunit could reduce the binding of NusA11 to RNA Pol. 

NusA has been postulated to modulate Rho activity in 
another way. By stimulating pausing of RNA Pol, NusA 
could facilitate coupling of transcription and translation 
(16, 17, 42, 147, 189). The length of the gap between the 
transcribing RNA Pol and the trailing ribosomes is thought 
to influence Rho action; i.e., the larger the gap, the larger 
the target for Rho action (1). By binding tightly to RNA Pol, 
NusA11 could increase pausing of RNA Pol. This would 
allow the translating ribosomes to keep up with the 
transcribing RNA Pol; i.e., stronger coupling of transcription 
and translation and, in turn, reduced Rho activity. 

Thus, NusA could reduce Rho termination by interacting 
with RNA Pol in two independent ways. How does this 
relate to the antitermination activity of N? We suggest that 
merely by facilitating the interaction of NusA with RNA Pol, 
N interferes with Rho action. 

Variation of the NUT Site 

H-19B (158), a lambdoid phage (82) carrying the gene encoding 
a Shiga toxin, stx, differs from the λ paradigm in both its 
nut site and host requirements (119, 120). H-19B shares the 
same N-NUT system with two other identified lambdoid 
phages: 933W (130) (another phage carrying an stx gene) 
and HK97 (88). This type of nut site (figure 9-2A) has two 
major differences from the λ paradigm (120). The first difference 
is in the boxA sequence, which is very degenerate 
compared with those of the λ type. Unlike λ, H-19B grows 
well in E. coli with either or both the nusB5 (or a nusB::cam 
insertion mutation) and nusE71 mutations (120). Hence, this 
phage, which does not appear to have a functional boxA 
sequence of the λ type, does not appear to require NusB or 
S10 (NusE), two proteins known to bind to wild-type BOXA 
(55, 108, 122, 129). 

The second unusual feature of the H-19B type of nut sites 
is the presence of hyphenated dyad symmetries in the spacer 
regions (figure 9-2A): two in nutL and one in nutR (120). 
Mutational analysis of nutR showed that nucleotide substitutions 
weakening formation of the spacer stem structure 
interfere with NUT activity. However, compensating mutations 
that change the sequence but allow the stem structure to form 
restore NUT function. Interestingly, sequences composing 
the entire stem-loop can be deleted without a significant 
loss of function. It was concluded that the stem structure 
serves to reduce the linear distance along the NUT RNA, in 
this way serving as a “reducer” by bringing separated regions 
of the NUT sequence together into a functional unit (120). 
The sequences in the “reducer” stem-loop of 933W and 
HK97 are the same as those in H-19B. This evolutionary 
conservation suggests that there may be an additional, as 
yet unidentified, role for the reducer stem-loop. 
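
The logic of the compensating-mutation test can be sketched by counting base pairs between the two arms of a stem. The arm sequences below are invented for illustration only and are not the H-19B spacer sequences; the function name is likewise hypothetical. G-U wobble pairs are counted as pairs, which is an assumption of this sketch.

```python
# Watson-Crick pairs plus G-U wobble (assumed acceptable in an RNA stem).
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def paired_positions(five_prime_arm, three_prime_arm):
    """Count base pairs between two stem arms, both written 5' to 3'.

    The 3' arm is read in reverse so that it aligns antiparallel to the
    5' arm, as in a hairpin stem.
    """
    if len(five_prime_arm) != len(three_prime_arm):
        raise ValueError("arms must be the same length")
    return sum((a, b) in PAIRS
               for a, b in zip(five_prime_arm, reversed(three_prime_arm)))


# Hypothetical 5-bp stem (not a real phage sequence).
wild_type = ("GGACU", "AGUCC")
disrupted = ("GGCCU", "AGUCC")    # substitution in the 5' arm breaks one pair
compensated = ("GGCCU", "AGGCC")  # second change in the 3' arm restores it

for label, (arm5, arm3) in [("wild type", wild_type),
                            ("disrupted", disrupted),
                            ("compensated", compensated)]:
    print(label, paired_positions(arm5, arm3))
```

The compensated stem has a different sequence from the wild type but the same number of base pairs, which is the pattern expected when structure rather than sequence is what matters.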


Translation Regulation of N Expression 

The importance of N as a regulatory protein is reflected in 
the number of ways the expression of the N gene is 
controlled. As the first gene of the pL operon, the N gene is 
regulated at the transcription level by CI and Cro repression 
of pL (131). At the post-translation level, the Lon protease 
degrades N protein, causing a relatively short protein 
half-life (68). Studies with λ have identified two other regulatory 
mechanisms that modulate N expression, at the level of 
translation, through sites encoded in sequences in the 223 
base long N-leader (figure 9-3). The first of these sequences 
is the nutL site and the second is a stem-loop structure 
sensitive to the double-stranded RNA-dependent endonuclease, 
RNase III (24). This site, the N ribonuclease III (three) site, is 
referred to as N(RTS) (35). 

The following discussion of these regulatory mechanisms
is based primarily on three papers, Wilson et al.
(181), Kameyama et al. (90), and Wilson et al. (182), and is
illustrated in figures 9-3 and 9-4. It was surprising to
discover that the NUT site, which we have seen is a key
element in the action of N in modifying the transcription
complex, is also involved in controlling N synthesis at the
level of translation. Even more confounding was the discovery
that the very Nus products required for N's transcription
antitermination activity are also required for this
translation repression.

Figure 9-3 The N-leader transcript beyond pL. The sequence
is shown starting at the BOXA sequence of NUTL; numbers
indicate distance from the RNA start of pL. The structure of
the RNase III site is shown with the position of cleavage
sites marked by arrows. The N ribosome-binding site is
underlined.

This regulatory process relies on a
complicated interplay between the N(RTS), the NUT site,
and the N ribosome-binding site. The RNase III-sensitive
stem-loop structure can be viewed as being a central player
in translation control. Located just upstream of the N gene, this
large stem-loop structure in the RNA has two roles: first, it
acts as a direct translation inhibitor, preventing ribosomes
from easily binding at the N initiation region; second, it
holds the N gene initiation codon in the correct position
relative to NUT for autorepression when N and the Nus factors
bind to the NUT site. The stem is a substrate for RNase III
(107), and its cleavage by RNase III prevents both types of
repressive effects on N synthesis. Hence, translation repression
is best observed in an RNase III mutant. In the absence
of cleavage of the N(RTS), the newly expressed N protein acts
by binding at the NUT site to block access to the N
ribosome-binding site. This activity of N, like the antitermination
activity, requires the participation of the Nus factors.
Translation repression is specific for expression of N since
translation of downstream genes is not affected. Cleavage
by RNase III separates this ribosome-binding site from the
NUT site and thus eliminates N repression. An N leader
deleted for sequences encoding the entire N(RTS) maintains
N-mediated translation repression. This shows that the
stem structure itself is not essential for translation repression
by N. RNase III does not prevent translation repression
when the N(RTS) is not present because the processing
site has been removed. Hence, RNase III's role in translation
repression is due to its action at the N leader site.

In summary, the N(RTS) and RNase III action at that
site regulate N expression by influencing translation
initiation. The structure per se partially interferes with
ribosome binding, and N-mediated repression provides an
additional block by sterically inhibiting ribosome binding.
RNase III and N expression both increase with increasing
growth rate (182), consistent with the hypothesis that
RNase III activity stimulates N gene translation through
cleavage of the inhibitory hairpin in a growth-rate-dependent
manner. The relationship between RNase III
activity and N expression can thus tie lambda development
to the physiology of the host cell.

The HK022 PUT Sequence 

Phage HK022 has PUT sequences downstream of the pL and
pR promoters in place of the NUT sequences found in other
lambdoid phages. Although other phages with PUT sites
have recently been identified (R. Farooque and R. King,
personal communication), all the work to date on this
antitermination system has focused on HK022. Unlike NUT, PUT
appears to modify RNA Pol to a termination-resistant form
without the aid of either phage- or host-encoded proteins
(22, 92, 124). The put sites (figure 9-2B) are composed of
two regions of hyphenated dyad symmetry separated by
one nucleotide (6, 22). Studies with mutant put sites show
that PUT RNA must form both stem-loop structures to be
active (92). Moreover, to be effective, the PUT site must
remain tethered to RNA Pol (153).

Figure 9-4 Assembly of transcription antitermination complex. N translation repression modes. RNA polymerase (RNAP) is
shown transcribing from the pL promoter through nutL and the RNase III site (RTS) into the N gene. The antitermination
complex is formed first at NUTL (to the right of step 1) and the RTS is transcribed and assembles beyond step 2. If RNase III is
not immediately present the Shine-Dalgarno (SD) is protected in some way from ribosome binding. If RNase III is present
then the RTS is processed and this allows optimal translation of N. Note that the antitermination complex may remain bound
to RNA polymerase. See thebacteriophages.org/frames_0090.htm for a color version of this figure.
Panel labels:
N is the first λ gene expressed from the pL operon. Once made, N-mediated antitermination allows expression of the other λ genes.
N antitermination.
Without RNase III cleavage, RNA involved in RTS structure retained. Stem-loop structure forms and aligns N for autoregulation. Bound N-Nus complex blocks ribosome binding site.
RNase III concentrations critical to N regulation and expression. RNase III levels rise and fall with growth rate. Causes N levels to rise and fall with growth rate.
*λ can use N regulation by RNase III as a sensor of cell growth.

The nusA1 mutation does not influence PUT activity (124),
and the search for host mutants uncovered mutations only
in sequences encoding the amino-proximal domain of the β'
subunit of RNA Pol (22). These changes occur in a region
with a zinc finger motif (having a group of four invariant
cysteine residues) and did not confer any other observable
phenotype. Nuclease protection experiments on stalled
transcription complexes at put showed that the central part of
the downstream stem and the 3' arm of the upstream stem
were specifically protected (153). Moreover, once the RNA
Pol passed through the put site, the DNA that had already
been transcribed was shown to no longer be necessary for
downstream transcription antitermination. An rpoC RNA
Pol mutant defective for PUT-directed antitermination
failed to show significant protection of PUT RNA.

Based on the idea that the large number of basic residues 
in the zinc finger may serve as a surface that interacts with 
RNA, Sen et al. (152) changed each of the basic residues in 
the zinc finger to Ala. Although none of these changes was 
lethal to the bacterium, a number of them failed to support 
growth of HK022. Many of the mutants functioned better
with putL than with putR, leading Sen et al. (152) to propose
that the zinc finger distinguishes between the stem-loops of
the two PUT RNAs. Because some of the mutant rpoC
products were more defective in vitro than in vivo for the 
same PUT action at the same terminators, it was suggested 
that there may be additional factors active in vivo supporting 
PUT-mediated antitermination despite the genetic evidence 
to the contrary. 

The HK022 Nun Protein 

Although Nun has not been identified as a regulatory protein
for its own genome, the action of the HK022 nun gene product is
so tied to the action of λ N that it warrants discussion in
this review. The HK022 nun gene is located in the same
relative position on the HK022 genome as the various N
genes are located on the genomes of other lambdoid phages.
However, the Nun protein does not act as an antitermination
protein for HK022 but, instead, acts by terminating
transcription on the λ genome at sites not normally involved in
termination (reviewed in 67, 177, 178). Expressed by the
HK022 prophage from a constitutive promoter immediately
upstream of the nun gene (93), Nun acts specifically with
NUT sites of the λ type, but not with NUT sites differing
from that of the λ type (136). Acting through the λ NUT
sites, Nun arrests transcription initiating from the early λ
promoters (83). As shown for transcription initiating at the
λ pL promoter, Nun action generates short RNAs spanning
100 nucleotides downstream of nutL (157). Action of Nun,
like action of λ N, requires the four Nus factors (83, 144).
Consistent with these requirements, Nun competes with λ
N for binding to the λ BOXB RNA (19, 83). As observed with
N, the amino portion of Nun is rich in Arg residues (the
arginine-rich motif, ARM) and binds to the BOXB structure
in a manner similar to the binding of N (19, 41, 77).
However, in vitro Nun arrests transcription
and blocks the translocation of RNA Pol without
causing its release, and in vivo terminates transcription,
causing dissociation of the ternary complex (83, 84). In vivo
studies with hybrid proteins expressed from hybrid N-nun
genes provide evidence that the nature of binding to BOXB
does not determine the mode of action of Nun or N. For
example, a hybrid N-Nun protein having the amino one-third
from Nun and the carboxy two-thirds from N acts at
the λ NUT site to promote transcription antitermination (77).

The folding of the wild-type Nun protein inhibits its own
binding to RNA, an inhibition that is relieved by Nun binding
of NusA. The self-inhibition, which is caused by the C-terminus
folding back on the ARM, is particularly noticeable in the
presence of Zn2+ (174).

Trp 108, the penultimate amino acid, is essential for Nun
action. A transcribing elongation complex including the
nascent RNA that is modified by a mutant Nun protein missing
this Trp residue can freely switch from one DNA template
to another DNA template, an action not permitted by the
wild-type Nun protein. It was proposed that Nun interferes
with transcription elongation by a braking mechanism; the
Trp residue on the Nun protein bound to RNA Pol intercalates
into the minor groove of the DNA (176). A crosslinking
experiment confirmed that, at sites of Nun-mediated termination,
the ultimate amino acid on the Nun associated with
the stalled RNA Pol is linked to the DNA template (175).

Recent studies suggest that Nun, like N, represses the
translation of the λ N gene through a complex formed at
NUTL (M. Gottesman and D. Court, unpublished result).
This provides confirmatory evidence that highly similar
mechanisms are used by both N and Nun to carry out
antitermination and termination, respectively, using the
NUT site. In fact, under very special conditions, Nun,
acting through a mutant NUT site, does not terminate λ
transcription but, in a limited way, suppresses transcription
termination (145).

The Q Transcription Antiterminator 

The Q gene of λ, located 5.8 kb beyond the pR promoter and
distal to at least four transcription terminators, requires N
antitermination for expression. Q function, in turn, is
required for full expression of the phage late genes (39, 87)
and serves as a second antitermination factor (figure 9-1)
(43, 139). Q, like N, can antiterminate at Rho-dependent and
Rho-independent terminators (58, 186). As we will see, the
requirements for and the mechanics of modification of the
RNA Pol complex are very different for Q and N.

Q, like N, has a specific binding site, qut (Q utilization),
which is associated with the strong pR' late promoter
(figure 9-1) (159). However, unlike N, which works with λ
and non-λ promoters provided they are upstream of the nut
site (36), Q binds to the qut DNA sequence and, because qut is
an integral part of the pR' promoter, only antiterminates
transcription initiating at pR' (143, 187). Unlike N, which
binds RNA and interacts with a large complex of host Nus
factors, Q stabilizes polymerases initiating from the pR'
promoter independently of host factors. However, NusA has
been shown to stimulate antitermination activity of the λ
Q from 3- to 10-fold in vitro (69), but only marginally
influences antitermination activity of the Q of another lambdoid
phage, 82 (186). Once Q has modified RNA polymerase at pR',
the antitermination complex continues through the
Rho-independent tR' termination site 196 bases downstream and
ultimately through the entire late lytic gene operon, which is
26.5 kb long, with very little drop-off in transcription level. It
is not known whether any transcription terminators exist
beyond tR' in this long segment. Nor is the terminator
known that stops Q-dependent transcription beyond the
late genes. N antitermination can also proceed over long
distances, as demonstrated by transcription from the pL
promoter in the prophage extending all the way to the
bacterial gal operon more than 20 kb away (2). The level of
N-antiterminated transcripts that reach the gal operon is
greatly reduced relative to those that began at the pL
promoter. Between pL and gal there are many terminators,
and it has been shown that N does not completely overcome
them. However, it is not known whether Q antitermination
is highly processive or whether there are just very few
terminators in the late operon. In any case, transcription
over the entire length is well maintained.

Although Q activity is more efficient in cis, Q can act in
trans (40). For example, an infecting λ Q+ phage can
activate antitermination and late gene expression from
the pR'-qut site of a repressed λimm434 prophage,
a phage with a different immunity region but the same
pR'-qut as λ (28). Note that in the lysogen the pR' promoter
is constitutively active and is primed for Q action (140).
All identified lambdoid phages have Q-like functions. Some,
like those of P22 and λ, are nearly identical and
functionally interchangeable (142): a Q from one will activate
antitermination from the qut site of the other. On the other
hand, the Q proteins of phages 82 and 21 vary greatly in
sequence from each other (as well as from the Q of λ) and
only activate their own qut sites (186).

Roberts and collaborators have provided the bulk of our
knowledge about how Q protein causes antitermination.
Their in vitro transcription studies with purified components
have answered many of the initial questions. Early on it had
been determined in vitro that λ Q protein and the λ pR' promoter
region as well as the first 20 or so bases downstream of
the pR' start are required to form a transcription
antitermination complex with RNA Pol holoenzyme (159, 185).
Q has a limited window during the transcription process
to modify polymerase; it does not act during the transcription
initiation event at pR' nor does it alter the polymerase
elongation complex once the complex is well away from
the promoter (69). This window appears to be the brief
transition time between the initiation stage and the elongation
stage of RNA transcription (143). The details of these
initial requirements have now been worked out and a more
complete picture of the process is available.

RNA Pol initiating transcription at pR' pauses after
synthesizing a transcript of 16 nucleotides (69, 137). At that
point, polymerase elongation is delayed by a specific binding
of its σ70 component (75) to the nontemplate strand of
the DNA bubble (94, 134, 135). This binding occurs downstream
from the promoter start at a site that resembles
a -10 promoter sequence. Thus, σ70 rebinds to the DNA
after initiation but before it can be released from the stable
elongating RNA Pol complex. This rebinding causes a 30
second pause in vitro and a 25 second pause in vivo during
transcription without Q protein present.

Marr et al. (110) examined the nature of the paused RNA
Pol. They found that the paused complex with σ70 is very
similar in structure to the open complex of RNA Pol found
at promoters. It differs, however, in having a 16 nt RNA
product partially hybridized to the template strand within
the transcription bubble. Q binds to this paused intermediate,
making contact with the DNA upstream of the transcription
bubble. In this elongation type of “open complex,” σ70
contacts the DNA slightly differently from the way it does in
the open complex found at the promoter (110, 135). Region 4
of σ, the segment of σ that contacts the -35 segment of the
promoter, contacts Q. Region 2, the segment of σ that binds
the -10 region of promoters by contacting the unpaired
nontemplate strand in the open complex, binds the nontemplate
strand at the pause site in much the same way as at the
promoter. Region 3 of σ, however, is not properly positioned
relative to the rest of σ when compared with its position on
the open complex at the promoter. This probably is caused by
the presence of the RNA in the complex. When Q is present, it
displaces region 4 of σ and binds the DNA segment flanking
what would be the -35 position of the paused “open
complex.” This position, of course, is moved downstream
relative to the pR' promoter and is located between what
were the -35 and -10 sequences of the normal promoter.
Q binds to the DNA as a dimer (110, 188), activating the
elongation complex for antitermination. As RNA Pol moves
away from the promoter, σ70 is dissociated and Q joins
the RNA Pol complex.

The specificity and interactions of Q and σ in the paused
complex have been confirmed by genetic studies. Mutations
in the qut site at positions 15 and 13 bases upstream of the
normal transcription start site prevent Q binding but do not
prevent pausing at +16. These mutations define the Q binding
site on the DNA. Mutations in the downstream σ binding
site, at positions 2 and 6 bases beyond the normal transcription
start, prevent renewed σ binding and pausing of
polymerase at +16. All these mutations prevent antitermination,
demonstrating the functional requirement for both Q
and σ binding (89, 186, 187). Using the +2 and +6 mutations
that prevent σ rebinding, RNA Pol can be artificially paused
at the +16 position of the mutants by withdrawing appropriate
nucleotide substrates. This artificially paused polymerase
cannot bind Q and fails to antiterminate transcription,
demonstrating the special need for σ binding and a likely
interaction of σ with Q (135, 187).

Although only RNA Pol holoenzyme initiating from the
pR' promoter and its associated qut site are required for Q
antitermination in vitro, other components stimulate Q
antitermination as judged by in vitro and in vivo experiments.
When polymerase is trapped at the pause site by σ binding
to the nontemplate strand, it becomes constrained and
tends to slip backward into an arrested complex in which
the 3' end of the RNA is no longer at the active center of
polymerase. Q does not bind or modify this backtracked form of
polymerase. Two factors associated with RNA Pol, GreA and
GreB, cause polymerase to cleave the RNA at the active
center in a backtracked state (9), allowing polymerase to
resynthesize to the +16 pause and thereby providing Q
with another chance. Roberts and coworkers have shown
this stimulation by GreA and GreB using both in vitro and
in vivo studies. E. coli defective for GreA and GreB have 50%
reduced Q-mediated transcription antitermination (111).
However, even with reduced antitermination λ can still
form plaques on such greA-greB mutant strains.

NusA protein stimulates λ Q antitermination at downstream
terminators and has an effect very early in stabilizing
the binding of Q to the paused complex (185). Results in vitro
suggest that NusA displaces the σ subunit from RNA Pol in
the transition from initiation to elongation (61, 62, 71). Like
NusA, Q binds and displaces σ from the paused complex as
the Q-modified termination-resistant elongation complex is
established (188). Perhaps NusA functions catalytically in
helping Q to displace σ from the complex. Still, the mechanism
by which NusA stimulates Q is not known. Also, NusA
may remain with the Q-RNA Pol elongation complex in a
tighter than usual association with RNA Pol.

The combined action of the unique rpoC10 mutation,
altering the β' subunit of RNA polymerase, with certain
mutations in nusA has been shown to prevent λ growth and
specifically Q antitermination. In this case, Q overproduction
reverses the λ defect (85). Q levels are important here,
perhaps because there is a competition between Q and
NusA for polymerase in the paused complex or because this
combination of the defects in β' and NusA makes displacement
of σ more difficult.

Once Q binds to the paused complex it rapidly moves
polymerase into the elongation complex and displaces σ (188).
The elegant physical studies of Marr et al. (110) demonstrate
the Q-dependent repositioning and displacement of
σ-specific binding domains on the DNA at the pause site.
This alteration of σ binding by Q presumably weakens σ's
tenuous hold on the nontemplate strand, freeing the paused
complex to continue elongation. The bound Q protein has
replaced σ and has, in some way still not understood,
modified the elongation complex for antitermination. In the
new complex, Q is presumed to interact with the core
polymerase. Genetic studies suggest that regions in β and β'
may affect this binding, as mutations in rpoB and rpoC affect
antitermination action of Q (5, 85).

Once the elongating RNA Pol is modified by Q, its properties
change in two ways: it transcribes faster, pausing less on
the DNA, and passes through transcription terminators (186,
188). This raises the question as to whether increased
transcription speed could cause antitermination. Experiments in
which the speed of polymerase is reduced dramatically by
limiting substrate nucleotides demonstrate that Q antitermination
remains functional. In fact, Q can reduce dissociation
of an elongation complex that is stopped at a terminator and
would normally be released in the absence of Q. Thus, Q
modifies the transcription complex by making it more
stable and resistant to dissociation. This complex with
increased stability may be able to transcribe DNA more
rapidly (188).

In a study looking at both transcription termination and
Q antitermination, Yarnell and Roberts (188) demonstrated
that a simple antisense oligonucleotide could cause
transcription termination of paused polymerase complexes. The
antisense segment had to be appropriately located upstream
of the pause in the region normally occupied by the stem
structure of an intrinsic terminator. When added to static
elongation complexes stopped at the pause site, these short
antisense DNAs released the RNA and polymerase from the
DNA complex. If Q was part of the polymerase complex, Q
prevented release just as it can prevent normal termination.
Because the antisense oligonucleotide pairing, like the
stem-loop structure, is believed to simply dissociate the RNA/DNA
hybrid of the paused complex, these authors proposed two
general ways Q could work. It could be bound to polymerase
upstream of the RNA/DNA hybrid in such a way as to
prevent base pairing of the stem structure, or alternatively
Q could bind and stabilize the RNA/DNA hybrid itself.
Direct stabilization of the RNA/DNA hybrid by Q could also
prevent Rho-mediated termination events. Although we do
not know how Rho interacts with RNA Pol to cause termination,
it must ultimately destabilize the RNA/DNA hybrid (10).
We have previously suggested that NusA blocks Rho action
under certain conditions. Based on this idea, we further
suggest that NusA may serve to facilitate readthrough of
Rho termination by Q-modified RNA Pol. According to this
idea, Q stabilizes NusA interaction with RNA Pol. Thus,
based on our suggestion that the NusAll protein successfully
competes with Rho for binding of RNA Pol (see the
section on N action, p. 91), we raise the possibility that Q
stabilization of the NusA-RNA Pol interaction facilitates
NusA competition with Rho.

Coda 

Transcription antitermination can serve two different roles
useful to the phage. First, it allows the phage to acquire new
genes even if the genes have associated transcription
terminators. If an acquired gene is located within an operon, the
antitermination mechanism enables transcription to reach
downstream genes even if the acquired gene has an associated
downstream terminator. For example, in phages
carrying genes encoding Shiga toxin, even though the stx
genes have an associated promoter, pstx, transcription of
genes downstream of the stx genes only occurs from the
pR' promoter, which is upstream of pstx (119, 170). Second,
the antitermination system can serve as a regulator of
gene expression. The regulation can be temporal and/or
physiological.

In considering transcription antitermination in the lambdoid
phages, it is curious that these phages employ three
different strategies to achieve antitermination. The PUT
system appears to rely solely on an RNA structure. The N
system relies on RNA sequences and structure as well as
both phage- and host-encoded proteins. The Q system relies
on a DNA binding site within the promoter, synthesis of
a short transcript from that promoter, and a phage and a
host protein. What advantages do these different strategies
provide to the phage?

Antitermination systems that rely on interactions of
sequences in the RNA are always found in the early operons,
while those that rely on interactions at sequences in the DNA
are found in late operons. We suspect that this arrangement
of antitermination sequences is not fortuitous, but reflects
some selective advantage to the phage. Regulation of gene
expression of early functions is most important in temperate
phage development because it is during the expression of
these genes that the decision between lysogeny and lysis is
made. We have presented studies showing how the N-NUT
system provides regulation of gene expression at the levels
of transcription and translation. In a subset of phages
(those with N-NUT systems like that of H-19B), formation of
an RNA structure per se within the NUT region, called the
“reducer,” appears to contribute to the effectiveness of the
NUT site at directing antitermination. When the sequences
encoding the reducer structure are removed, the effectiveness
of the NUT site in directing N-mediated antitermination
is only modestly reduced. In λ, formation of a stem-loop
structure per se in the N leader regulates the effectiveness
of N-mediated antitermination in another way, by controlling
the level of N expression. When present it serves to inhibit
expression of N, and its removal by RNase III allows
higher levels of N to be expressed. Thus, modulation of the
action of RNase III provides a mechanism to allow input of
the physiological state of the bacterium into the regulation
of N expression. The requirement for the action of a number
of host proteins also provides additional potential targets for
regulatory action.

The situation is different for late gene expression. Once
the decision for lysis has been made, monitoring of late
gene expression is unnecessary and expression can essentially
be constitutive. The level of Q, which is determined by
activities regulated by actions in the early operons,
determines expression of late genes. Thus, it is likely that proper
phage development requires less regulation of late gene
expression. Although the λ Q activity is enhanced by NusA
in vitro, the activity of the Q of phage 82 is only modestly
enhanced by NusA (137, 186). Moreover, none of the other
Nus factors appear to contribute to Q action (7). Although
NusA may allow some regulation of Q action at qut, there is
obviously not the panoply of possible regulatory inputs that
are available through the NUT RNA site.

Results of studies with phage HK022 appear to contradict
the idea that the selective advantage of an RNA antiterminator
is that it provides more opportunity for more regulatory
inputs. The PUT RNA antiterminator located in the early
operons of HK022 appears to function independently of
either phage or host auxiliary proteins, unlike the N-Nus
modulated NUT RNAs. If the PUT stem-loop and RNA Pol
interact in the absence of other inputs and they alone are
sufficient to direct processive transcription through downstream
terminators, it would be difficult to argue that the
antitermination mechanism in the early operon provides an
important regulatory capability. PUT RNA is a component
of every initiated transcript and therefore it would appear
axiomatic that every RNA Pol becomes antitermination-proficient
as soon as the put sequence is transcribed. In vivo
genetic evidence suggests that only the RNA structure is
essential for PUT-mediated antitermination. However, two
observations showing that PUT-mediated antitermination
can be more effective in vivo than in vitro suggest that
as yet unidentified regulatory controls are exerted on
the structure and/or function of PUT in vivo. Firstly, the
antitermination activity observed with a DNA template
having a put site is not as efficient in a pure transcription
system with RNA Pol as it is with such a template in vivo.
Secondly, some rpoC mutations cause a greater decrease in
PUT effectiveness in vitro than in vivo (92, 152). Experiments
with λ provide direct evidence that N plays a regulatory role
in phage development (181). λ forms turbid plaques on E. coli,
and this indicates that some of the progeny phages formed
in the localized infection go down the lysogenic route.
However, the same λ forms clear plaques if the E. coli lawn
constitutively expresses N. This indicates that very few of
the progeny phages go down the lysogenic route if N
is present from the beginning of the infection. Hence, the
timing and perhaps the level of expression of N is an
important factor in determining how the phage develops.
We have already discussed how expression of N can be
regulated; establishing how PUT is regulated or whether
it is regulated awaits further work.

While many of the details of the action of λ antiterminators
have been elucidated, precisely how they enable RNA
Pol to transcend terminators awaits further work. We hope
that this chapter has provided new ways to think about this
process that may serve to stimulate further experimentation.


Acknowledgments 

The authors thank Carolyn McGill, Judah Rosner, Joshua
Filter, Robert Weisberg, and Jeffrey Roberts for careful reading
of parts of or the entire manuscript. Helen Wilson is thanked
for help with the illustrations. The work at the University of
Michigan was, in part, supported by Public Health Research
Grant AI11459-10.


References 

1. Adhya, S., and M. Gottesman. 1978. Control of transcription 
termination. Annu. Rev. Biochem. 47:967-996. 

2. Adhya, S., M. Gottesman, and B. De Crombrugghe. 1974. 
Release of polarity in Escherichia coli by gene N of phage 
lambda: termination and antitermination of transcription. 
Proc. Natl. Acad. Sci. USA 71:2534-2538. 

3. Altieri, A. S., M. J. Mazzulla, D. A. Horita, R. H. Coats, P. T. 
Wingfield, A. Das, D. L. Court, and R. A. Byrd. 2000. The 
structure of the transcriptional antiterminator NusB from 
Escherichia coli. Nat. Struct. Biol. 7:470-474. 

4. Antao, V. P., S. Y. Lai, and I. Tinoco Jr. 1991. A thermodynamic
study of unusually stable RNA and DNA hairpins.
Nucleic Acids Res. 19:5901-5905. 

5. Atkinson, B. L., and M. E. Gottesman. 1992. The Escherichia 
coli rpoB60 mutation blocks antitermination by coliphage 
HK022 Q-function. J. Mol. Biol. 227:29-37. 

6. Banik-Maiti, S., R. A. King, and R. A. Weisberg. 1997. 
The antiterminator RNA of phage HK022. J. Mol. Biol. 
272:677-687. 

7. Barik, S., and A. Das. 1990. An analysis of the role of host 
factors in transcription antitermination in vitro by the Q
protein of coliphage lambda. Mol. Gen. Genet. 222:152-156. 

8. Barik, S., B. Ghosh, W. Whalen, D. Lazinski, and A. Das. 
1987. An antitermination protein engages the elongating
transcription apparatus at a promoter-proximal recognition
site. Cell 50:885-899.

9. Borukhov, S., V. Sagitov, and A. Goldfarb. 1993. Transcript
cleavage factors from E. coli. Cell 72:459-466. 

10. Brennan, C. A., A. J. Dombroski, and T. Platt. 1987. Transcription
termination factor rho is an RNA-DNA helicase.
Cell 48:945-952. 

11. Burns, C. M., L. V. Richardson, and J. P. Richardson. 1998.
Combinatorial effects of NusA and NusG on transcription 
elongation and Rho-dependent termination in Escherichia 
coli. J. Mol. Biol. 278:307-316. 



12. Burova, E., and M. E. Gottesman. 1995. NusG overexpression
inhibits Rho-dependent termination in Escherichia
coli. Mol. Microbiol. 17:633-641.

13. Bycroft, M., T. J. P. Hubbard, M. Proctor, S. M. V. Freund, and
A. G. Murzin. 1997. The solution structure of the S1 RNA
binding domain: a member of an ancient nucleic acid-binding
fold. Cell 88:235-242.

14. Cai, Z., A. Gorin, R. Frederick, X. Ye, W. Hu, A. Majumdar, 
A. Kettani, and D. J. Patel. 1998. Solution structure of P22 
transcriptional antitermination N peptide-boxB RNA 
complex. Nat. Struct. Biol. 5:203-212. 

15. Campbell, A. M. 1961. Sensitive mutants of bacteriophage 
lambda. Virology 14:22-32. 

16. Chan, C. L., and R. Landick. 1989. The Salmonella typhimurium
his operon leader region contains an RNA hairpin-dependent
transcription pause site. Mechanistic implications
of the effect on pausing of altered RNA hairpins.
J. Biol. Chem. 264:20796-20804. 

17. Chan, C. L., and R. Landick. 1993. Dissection of the his 
leader pause site by base substitution reveals a multipartite 
signal that includes a pause RNA hairpin. J. Mol. Biol. 
233:25-42. 

18. Chattopadhyay, S., J. Garcia-Mena, J. DeVito, K. Wolska, and
A. Das. 1995. Bipartite function of a small RNA hairpin in 
transcription antitermination in bacteriophage lambda. 
Proc. Natl. Acad. Sci. USA 92:4061-4065. 

19. Chattopadhyay, S., S. C. Hung, A. C. Stuart, A. G. Palmer 
3rd, J. Garcia-Mena, A. Das, and M. E. Gottesman. 1995. 
Interaction between the phage HK022 Nun protein and 
the nut RNA of phage lambda. Proc. Natl. Acad. Sci. USA 
92:12131-12135. 

20. Chou, S. H., L. Zhu, and B. R. Reid. 1997. Sheared 
purine x purine pairing in biology. J. Mol. Biol. 
267:1055-1067. 

21. Cilley, C. D., and J. R. Williamson. 1997. Analysis of bacteriophage
N protein and peptide binding to boxB RNA using
polyacrylamide gel coelectrophoresis (PACE). RNA 3:57-67. 

22. Clerget, M., D. J. Jin, and R. A. Weisberg. 1995. A zinc-binding
region in the β′ subunit of RNA polymerase is involved
in antitermination of early transcription of phage HK022. 
J. Mol. Biol. 248:768-780. 

23. Conaway, J. W., A. Shilatifard, A. Dvir, and R. C. Conaway. 
2000. Control of elongation by RNA polymerase II. Trends 
Biochem. Sci. 25:375-380. 

24. Court, D. 1993. RNA processing and degradation by RNase 
III, pp. 71-116. In G. Brawerman, and J. Belasco (eds.) 
Control of mRNA Stability. Academic Press, New York. 

25. Court, D. L., T. A. Patterson, N. Baker, N. Costantino, C. Mao,
and D. I. Friedman. 1995. Structural and functional analysis 
of the transcription-translation proteins NusB and NusE. J. 
Bacteriol. 177:2589-2591. 

26. Craven, M. G., and D. I. Friedman. 1991. Analysis of the
Escherichia coli nusA10(Cs) allele: relating nucleotide
changes to phenotypes. J. Bacteriol. 173:1485-1491. 

27. Craven, M. G., A. E. Granston, A. T. Schauer, C. Zheng, T. A. 
Gray, and D. I. Friedman. 1994. Escherichia coli-Salmonella 
typhimurium hybrid nusA genes: identification of a short 
motif required for action of the λ N transcription antitermination
protein. J. Bacteriol. 176:1394-1404.


28. Dambly, C., M. Couturier, and R. Thomas. 1968. Control of 
development in temperate bacteriophages. II. Control of 
lysozyme synthesis. J. Mol. Biol. 32:67-81. 

29. Das, A. 1992. How the phage lambda N gene product 
suppresses transcription termination: communication of 
RNA polymerase with regulatory proteins mediated by 
signals in nascent RNA. J. Bacteriol. 174:6711-6716. 

30. Das, A. 1993. Control of transcription termination by 
RNA-binding proteins. Annu. Rev. Biochem. 62:893-930.

31. Das, A., S. Barik, B. Ghosh, and W. Whalen. 1996. 
Immunoprinting: a technique used to study dynamic 
protein-nucleic acid interactions within transcription 
elongation complex. Methods Enzymol. 274:363-374. 

32. Das, A., M. E. Gottesman, J. Wardwell, P. Trisler, and 
S. Gottesman. 1983. A mutation in the Escherichia coli rho 
gene that inhibits the N protein activity of phage lambda. 
Proc. Natl. Acad. Sci. USA 80:5530-5534. 

33. Das, A., M. Pal, J. G. Mena, W. Whalen, K. Wolska, R. Crossley,
W. Rees, P. H. von Hippel, N. Costantino, D. Court,
M. Mazzulla, A. S. Altieri, R. A. Byrd, S. Chattopadhyay,
J. DeVito, and B. Ghosh. 1996. Components of multiprotein-RNA
complex that controls transcription elongation in
Escherichia coli phage lambda. Methods Enzymol. 
274:374-402. 

34. Das, A., and K. Wolska. 1984. Transcription antitermination
in vitro by lambda N gene product: requirement for
a phage nut site and the products of host nusA, nusB, 
and nusE genes. Cell 38:165-173. 

35. Dasgupta, S., L. Fernandez, L. Kameyama, T. Inada, 
Y. Nakamura, A. Pappas, and D. L. Court. 1998. Genetic 
uncoupling of the dsRNA-binding and RNA cleavage 
activities of the Escherichia coli endoribonuclease RNase 
III: the effect of dsRNA binding on gene expression. Mol. 
Microbiol. 28:629-660. 

36. De Crombrugghe, B., M. Mudryj, R. DiLauro, and 
M. Gottesman. 1979. Specificity of the bacteriophage 
lambda N gene product (pN): nut sequences are necessary 
and sufficient for antitermination by pN. Cell 18:1145-1151. 

37. DeVito, J., and A. Das. 1994. Control of transcription 
processivity in phage lambda: Nus factors strengthen 
the termination-resistant state of RNA polymerase 
induced by N antiterminator. Proc. Natl. Acad. Sci. USA 
91:8660-8664. 

38. Doelling, J. H., and N. C. Franklin. 1989. Effects of all single 
base substitutions in the loop of boxB on antitermination of 
transcription by bacteriophage lambda’s N protein. Nucleic 
Acids Res. 17:5565-5577. 

39. Dove, W. F. 1966. Action of the lambda chromosome.
I. Control of functions late in bacteriophage development.
J. Mol. Biol. 19:187-201.

40. Echols, H., D. Court, and L. Green. 1976. On the nature of 
cis-acting regulatory proteins and genetic organization in 
bacteriophage: the example of gene Q of bacteriophage
lambda. Genetics 83:5-10.

41. Faber, C., M. Scharpf, T. Becker, H. Sticht, and P. Rosch. 
2001. The structure of the coliphage HK022 Nun protein-lambda-phage
boxB RNA complex. Implications for the
mechanism of transcription termination. J. Biol. Chem. 
276:32064-32070. 



42. Farnham, P. J., J. Greenblatt, and T. Platt. 1982. Effects of 
NusA protein on transcription termination in the tryptophan
operon of Escherichia coli. Cell 29:945-951.

43. Forbes, D., and I. Herskowitz. 1982. Polarity suppression by 
the Q gene product of bacteriophage lambda. J. Mol. Biol. 
160:549-569. 

44. Franklin, N. C. 1985. Conservation of genome form but not 
sequence in the transcription antitermination determinants
of bacteriophages lambda, phi 21 and P22. J. Mol.
Biol. 181:75-84. 

45. Franklin, N. C. 1993. Clustered arginine residues of bacteriophage
lambda N protein are essential to antitermination
of transcription, but their locale cannot compensate for 
boxB loop defects. J. Mol. Biol. 231:343-360. 

46. Franklin, N. C., and G. N. Bennett. 1979. The N protein of 
bacteriophage lambda, defined by its DNA sequence, is 
highly basic. Gene 8:107-119. 

47. Franklin, N. C., and J. H. Doelling. 1989. Overexpression of
N antitermination proteins of bacteriophages lambda, 
phi21, and P22: loss of N protein specificity. J. Bacteriol. 
171:2513-2522. 

48. Friedman, D. I. 1988. Regulation of phage gene expression
by termination and antitermination of transcription,
pp. 263-319. In R. Calendar (ed.) The Bacteriophages, vol. 2.
Plenum Press, New York. 

49. Friedman, D. I., M. Baumann, and L. S. Baron. 1976. 
Cooperative effects of bacterial mutations affecting lambda 
N gene expression. I. Isolation and characterization of a 
nusB mutant. Virology 73:119-127.

50. Friedman, D. I., and D. L. Court. 1995. Transcription antitermination:
the lambda paradigm updated. Mol. Microbiol.
18:191-200. 

51. Friedman, D. I., and D. L. Court. 2001. Bacteriophage
lambda: alive and well and still doing its thing. Curr. Opin. 
Microbiol. 4:201-207. 

52. Friedman, D. I., and M. Gottesman. 1983. Lytic mode of
lambda development, pp. 21-51. In R. W. Hendrix, J. W.
Roberts, F. W. Stahl, and R. A. Weisberg (eds.) Lambda II. Cold
Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y. 

53. Friedman, D. I., and E. R. Olson. 1983. Evidence that a 
nucleotide sequence, “boxA,” is involved in the action of the
NusA protein. Cell 34:143-149. 

54. Friedman, D. I., E. R. Olson, C. Georgopoulos, K. Tilly,
I. Herskowitz, and F. Banuett. 1984. Interactions of
bacteriophage and host macromolecules in the growth 
of bacteriophage lambda. Microbiol. Rev. 48:299-325. 

55. Friedman, D. I., E. R. Olson, L. L. Johnson, D. Alessi, and 
M. G. Craven. 1990. Transcription-dependent competition 
for a host factor: the function and optimal sequence of the 
phage lambda boxA transcription antitermination signal. 
Genes Dev. 4:2210-2222. 

56. Friedman, D. I., A. T. Schauer, M. R. Baumann, L. S. Baron, 
and S. L. Adhya. 1981. Evidence that ribosomal protein S10 
participates in control of transcription termination. Proc. 
Natl. Acad. Sci. USA 78:1115-1118. 

57. Ghosh, B., and A. Das. 1984. nusB: a protein factor necessary
for transcription antitermination in vitro by phage
lambda N gene product. Proc. Natl. Acad. Sci. USA 
81:6305-6309. 


58. Ghosh, B., E. Grzadzielska, P. Bhattacharya, E. Peralta,
J. DeVito, and A. Das. 1991. Specificity of antitermination
mechanisms. Suppression of the terminator cluster T1-T2 
of Escherichia coli ribosomal RNA operon, rrnB, by phage 
lambda antiterminators. J. Mol. Biol. 222:59-66. 

59. Ghysen, A., and M. Pironio. 1972. Relationship between 
the N function of bacteriophage lambda and host RNA 
polymerase. J. Mol. Biol. 65:259-272. 

60. Gibson, T. J., J. D. Thompson, and J. Heringa. 1993. The 
KH domain occurs in a diverse set of RNA-binding 
proteins that include the antiterminator NusA and is 
probably involved in binding to nucleic acid. FEBS Lett. 
324:361-366. 

61. Gill, S. C., S. E. Weitzel, and P. H. von Hippel. 1991. Escherichia
coli sigma 70 and NusA proteins. I. Binding interactions 
with core RNA polymerase in solution and within the 
transcription complex. J. Mol. Biol. 220:307-324. 

62. Gill, S. C., T. D. Yager, and P. H. von Hippel. 1991. Escherichia
coli sigma 70 and NusA proteins. II. Physical properties 
and self-association states. J. Mol. Biol. 220:325-333. 

63. Goda, Y., and J. Greenblatt. 1985. Efficient modification of E. 
coli RNA polymerase in vitro by the N gene transcription 
antitermination protein of bacteriophage lambda. Nucleic 
Acids Res. 13:2569-2582. 

64. Gopal, B., L. F. Haire, R. A. Cox, M. J. Colston, S. Major,
J. A. Brannigan, S. J. Smerdon, and G. Dodson. 2000. The
crystal structure of NusB from Mycobacterium tuberculosis. 
Nat. Struct. Biol. 7:475-478. 

65. Gopal, B., L. F. Haire, S. J. Gamblin, E. J. Dodson, A. N. Lane,
K. G. Papavinasasundaram, M. J. Colston, and G. Dodson.
2001. Crystal structure of the transcription elongation/
anti-termination factor NusA from Mycobacterium tuberculosis
at 1.7 Å resolution. J. Mol. Biol. 314:1087-1095.

66. Gopal, B., K. G. Papavinasasundaram, G. Dodson, M. J. 
Colston, S. A. Major, and A. N. Lane. 2001. Spectroscopic 
and thermodynamic characterization of the transcription 
antitermination factor NusE and its interaction with 
NusB from Mycobacterium tuberculosis. Biochemistry 
40:920-928. 

67. Gottesman, M. E., and R. A. Weisberg. 1995. Termination 
and antitermination of transcription in temperate bacteriophage.
Semin. Virol. 6:35-42.

68. Gottesman, S., M. Gottesman, J. E. Shaw, and M. L. Pearson. 
1981. Protein degradation in E. coli: the lon mutation and
bacteriophage lambda N and cII protein stability. Cell
24:225-233. 

69. Grayhack, E. J., X. J. Yang, L. F. Lau, and J. W. Roberts. 
1985. Phage lambda gene Q antiterminator recognizes 
RNA polymerase near the promoter and accelerates it 
through a pause site. Cell 42:259-269. 

70. Greenblatt, J. 1992. Protein-protein interactions as 
critical determinants of regulated initiation and termination
of transcription, pp. 203-226. In S. L. McKnight, and
K. R. Yamamoto (eds.) Transcriptional Regulation. Cold 
Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y. 

71. Greenblatt, J., and J. Li. 1981. Interaction of the sigma factor 
and the nusA gene protein of E. coli with RNA polymerase 
in the initiation-termination cycle of transcription. Cell 
24:421-428. 



72. Greenblatt, J., and J. Li. 1982. Properties of the N gene 
transcription antitermination protein of bacteriophage 
lambda. J. Biol. Chem. 257:362-365. 

73. Greenblatt, J., T. F. Mah, P. Legault, J. Mogridge, J. Li, 
and L. E. Kay. 1998. Structure and mechanism in 
transcriptional antitermination by the bacteriophage 
lambda N protein. Cold Spring Harb. Symp. Quant. Biol. 
LXIII:327-336.

74. Greenblatt, J., J. R. Nodwell, and S. W. Mason. 1993. 
Transcriptional antitermination. Nature 364:401-406. 

75. Gross, C. A., C. Chan, A. Dombroski, T. Gruber, M. Sharp, 
J. Tupy, and B. Young. 1998. The functional and regulatory 
roles of sigma factors in transcription. Cold Spring Harb. 
Symp. Quant. Biol. LXIII:141-155. 

76. Gusarov, I., and E. Nudler. 2001. Control of intrinsic 
transcription termination by N and NusA: the basic 
mechanisms. Cell 107:437-449. 

77. Henthorn, K. S., and D. I. Friedman. 1996. Identification of 
functional regions of the Nun transcription termination 
protein of phage HK022 and the N antitermination protein 
of phage lambda using hybrid nun-N genes. J. Mol. Biol.
257:9-20. 

78. Herskowitz, I., and D. Hagen. 1980. The lysis-lysogeny 
decision of phage lambda: explicit programming and 
responsiveness. Annu. Rev. Genet. 14:399-445. 

79. Heus, H. A., and A. Pardi. 1991. Structural features that give 
rise to the unusual stability of RNA hairpins containing 
GNRA loops. Science 253:191-194. 

80. Hilliker, S., and D. Botstein. 1975. An early regulatory gene 
of Salmonella phage P22 analogous to gene N of coliphage 
lambda. Virology 68:510-524. 

81. Horwitz, R. J., J. Li, and J. Greenblatt. 1987. An elongation
control particle containing the N gene transcriptional 
antitermination protein of bacteriophage lambda. Cell 
51:631-641. 

82. Huang, A., J. Friesen, and J. L. Brunton. 1987. Characterization
of a bacteriophage that carries the genes for production
of Shiga-like toxin I in Escherichia coli. J. Bacteriol.
169:4308-4312. 

83. Hung, S. C., and M. E. Gottesman. 1995. Phage HK022 Nun 
protein arrests transcription on phage λ DNA in vitro and
competes with the phage λ N antitermination protein.
J. Mol. Biol. 247:428-442. 

84. Hung, S. C., and M. E. Gottesman. 1997. The Nun protein 
of bacteriophage HK022 inhibits translocation of Escherichia
coli RNA polymerase without abolishing its catalytic
activities. Genes Dev. 11:2670-2678. 

85. Ito, K., and Y. Nakamura. 1993. Pleiotropic effects of 
the rpoC10 mutation affecting the RNA polymerase β′
subunit of Escherichia coli on factor-dependent transcription
termination and antitermination. Mol. Microbiol.
9:285-293. 

86. Jin, D. J., M. Cashel, D. I. Friedman, Y. Nakamura, 
W. A. Walter, and C. A. Gross. 1988. Effects of rifampicin 
resistant rpoB mutations on antitermination and 
interaction with nusA in Escherichia coli. J. Mol. Biol. 
204:247-261. 

87. Joyner, A., L. N. Isaacs, H. Echols, and W. S. Sly. 1966. DNA 
replication and messenger RNA production after induction
of wild-type lambda bacteriophage and lambda mutants.
J. Mol. Biol. 19:174-186. 

88. Juhala, R. J., M. E. Ford, R. L. Duda, A. Youlton, G. F. Hatfull, 
and R. W. Hendrix. 2000. Genomic sequences of bacteriophages
HK97 and HK022: pervasive mosaicism in the
lambdoid phages. J. Mol. Biol. 299:27-51. 

89. Kainz, M., and J. Roberts. 1992. Structure of transcription 
elongation complexes in vivo. Science 255:838-841. 

90. Kameyama, L., L. Fernandez, D. L. Court, and 
G. Guarneros. 1991. RNaseIII activation of bacteriophage
lambda N synthesis. Mol. Microbiol. 5:2953-2963.

91. Keppel, F., C. P. Georgopoulos, and H. Eisen. 1974. Host 
interference with expression of the lambda N gene 
product. Biochimie 56:1505-1509. 

92. King, R. A., S. Banik-Maiti, D. J. Jin, and R. A. Weisberg. 
1996. Transcripts that increase the processivity and elongation
rate of RNA polymerase. Cell 87:893-903.

93. King, R. A., P. L. Madsen, and R. A. Weisberg. 2000. Constitutive
expression of a transcription termination factor by
a repressed prophage: promoters for transcribing the 
phage HK022 nun gene. J. Bacteriol. 182:456-462. 

94. Ko, D. C., M. T. Marr, J. Guo, and J. W. Roberts. 1998. A 
surface of Escherichia coli sigma 70 required for promoter 
function and antitermination by phage lambda Q protein. 
Genes Dev. 12:3276-3285. 

95. Komissarova, N., and M. Kashlev. 1997. RNA polymerase 
switches between inactivated and activated states by 
translocating back and forth along the DNA and the 
RNA. J. Biol. Chem. 272:15329-15338. 

96. Korzheva, N., A. Mustaev, E. Nudler, V. Nikiforov, and A.
Goldfarb. 1998. Mechanistic model of the elongation
complex of Escherichia coli RNA polymerase. Cold Spring
Harb. Symp. Quant. Biol. LXIII:337-435.

97. Landick, R. 1997. RNA polymerase slides home: pause and 
termination site recognition. Cell 88:741-744. 

98. Lau, L. F., and J. W. Roberts. 1985. Rho-dependent transcription
termination at lambda tR1 requires upstream
sequences. J. Biol. Chem. 260:574-584. 

99. Lazinski, D., E. Grzadzielska, and A. Das. 1989. Sequence-specific
recognition of RNA hairpins by bacteriophage
antiterminators requires a conserved arginine-rich 
motif. Cell 59:207-218. 

100. Lecocq, J., and C. Dambly. 1976. A bacterial RNA polymerase
mutant that renders lambda growth independent of
the N and cro functions at 42 degrees C. Mol. Gen. Genet. 
145:53-64. 

101. Legault, P., J. Li, J. Mogridge, L. E. Kay, and J. Greenblatt.
1998. NMR structure of the bacteriophage lambda N 
peptide/boxB RNA complex: recognition of a GNRA fold 
by an arginine-rich motif. Cell 93:289-299. 

102. Lew, K., and S. Casjens. 1975. Identification of early 
proteins coded by bacteriophage P22. Virology 68:525-533.

103. Li, J., R. Horwitz, S. McCracken, and J. Greenblatt. 1992.
NusG, a new Escherichia coli elongation factor involved in 
transcriptional antitermination by the N protein of phage 
lambda. J. Biol. Chem. 267:6012-6019. 

104. Li, J., S. W. Mason, and J. Greenblatt. 1993. Elongation
factor NusG interacts with termination factor rho to
regulate termination and antitermination of transcription.
Genes Dev. 7:161-172.

105. Liu, K., and M. M. Hanna. 1995. NusA interferes with
interactions between the nascent RNA and the 
C-terminal domain of the alpha subunit of RNA 
polymerase in Escherichia coli transcription complexes. 
Proc. Natl. Acad. Sci. USA 92:5012-5016. 

106. Liu, K., Y. Zhang, K. Severinov, A. Das, and M. M. Hanna.
1996. Role of Escherichia coli RNA polymerase alpha
subunit in modulation of pausing, termination and antitermination
by the transcription elongation factor NusA.
EMBO J. 15:150-161.

107. Lozeron, H. A., J. E. Dahlberg, and W. Szybalski. 1976. 
Processing of the major leftward mRNA of coliphage 
lambda. Virology 71:262-277. 

108. Luttgen, H., R. Robelek, R. Muhlberger, T. Diercks, S. C. 
Schuster, P. Kohler, H. Kessler, A. Bacher, and G. Richter. 
2002. Transcriptional regulation by antitermination. 
Interaction of RNA with NusB Protein and NusB/ 
NusE protein complex of Escherichia coli. J. Mol. Biol. 
316:875-885. 

109. Mah, T. F., J. Li, A. R. Davidson, and J. Greenblatt. 1999. 
Functional importance of regions in Escherichia coli elongation
factor NusA that interact with RNA polymerase,
the bacteriophage lambda N protein and RNA. Mol. 
Microbiol. 34:523-537. 

110. Marr, M. T., S. A. Datwyler, C. F. Meares, and J. W. Roberts. 
2001. Restructuring of an RNA polymerase holoenzyme 
elongation complex by lambdoid phage Q proteins. Proc. 
Natl. Acad. Sci. USA 98:8972-8978. 

111. Marr, M. T., and J. W. Roberts. 2000. Function of transcription
cleavage factors GreA and GreB at a regulatory pause
site. Mol. Cell 6:1275-1285. 

112. Mason, S. W., J. Li, and J. Greenblatt. 1992. Direct interaction
between two Escherichia coli transcription antitermination
factors, NusB and ribosomal protein S10. J. Mol.
Biol. 223:55-66. 

113. Mason, S. W., J. Li, and J. Greenblatt. 1992. Host factor 
requirements for processive antitermination of transcription
and suppression of pausing by the N protein of bacteriophage
lambda. J. Biol. Chem. 267:19418-19426.

114. Mogridge, J., P. Legault, J. Li, M. D. Van Oene, L. E. Kay, and 
J. Greenblatt. 1998. Independent ligand-induced folding
of the RNA-binding domain and two functionally distinct 
antitermination regions in the phage lambda N protein. 
Mol. Cell 1:265-275. 

115. Mogridge, J., T.-F. Mah, and J. Greenblatt. 1995. A protein-RNA
interaction network facilitates the template-independent
cooperative assembly on RNA polymerase
of a stable antitermination complex containing the λ N
protein. Genes Dev. 9:2831-2844. 

116. Mogridge, J., T. F. Mah, and J. Greenblatt. 1998. Involvement
of boxA nucleotides in the formation of a stable ribonucleoprotein
complex containing the bacteriophage
lambda N protein. J. Biol. Chem. 273:4143-4148. 

117. Mooney, R. A., I. Artsimovitch, and R. Landick. 1998. 
Information processing by RNA polymerase: recognition 
of regulatory signals during RNA chain elongation. 
J. Bacteriol. 180:3265-3275. 


118. Nakamura, Y., S. Mizusawa, A. Tsugawa, and M. Imai. 1986.
Conditionally lethal nusAts mutation of Escherichia coli 
reduces transcription termination but does not affect 
antitermination of bacteriophage lambda. Mol. Gen. 
Genet. 204:24-28. 

119. Neely, M. N., and D. I. Friedman. 1998. Functional and 
genetic analysis of regulatory regions of coliphage H-19B: 
location of Shiga-like toxin and lysis genes suggest a role 
for phage functions in toxin release. Mol. Microbiol. 
28:1255-1267. 

120. Neely, M. N., and D. I. Friedman. 2000. N-mediated 
transcription antitermination in lambdoid phage H-19B 
is characterized by alternative NUT RNA structures and 
a reduced requirement for host factors. Mol. Microbiol. 
38:1074-1085. 

121. Nehrke, K. W., and T. Platt. 1994. A quaternary transcription
termination complex. Reciprocal stabilization by
Rho factor and NusG protein. J. Mol. Biol. 243:830-839. 

122. Nodwell, J. R., and J. Greenblatt. 1993. Recognition of 
boxA antiterminator RNA by the E. coli antitermination 
factors NusB and ribosomal protein S10. Cell 72:261-268. 

123. Nudler, E. 1999. Transcription elongation: structural 
basis and mechanisms. J. Mol. Biol. 288:1-12. 

124. Oberto, J., M. Clerget, M. Ditto, K. Cam, and R. A. Weisberg.
1993. Antitermination of early transcription in phage 
HK022. Absence of a phage-encoded antitermination 
factor. J. Mol. Biol. 229:368-381. 

125. Obuchowski, M., A. Wegrzyn, A. Szalewska-Palasz, 
M. S. Thomas, and G. Wegrzyn. 1997. An RNA polymerase 
alpha subunit mutant impairs N-dependent transcriptional
antitermination in Escherichia coli. Mol. Microbiol.
23:211-222. 

126. Olson, E. R., E. L. Flamm, and D. I. Friedman. 1982. 
Analysis of nutR: a region of phage lambda required for 
antitermination of transcription. Cell 31:61-70. 

127. Olson, E. R., C. S. Tomich, and D. I. Friedman. 1984. The 
nusA recognition site. Alteration in its sequence or position
relative to upstream translation interferes with the
action of the N antitermination function of phage 
lambda. J. Mol. Biol. 180:1053-1063. 

128. Pasman, Z., and P. H. von Hippel. 2000. Regulation of rho-dependent
transcription termination by NusG is specific
to the Escherichia coli elongation complex. Biochemistry 
39:5573-5585. 

129. Patterson, T. A., Z. Zhang, T. Baker, L. L. Johnson, 
D. I. Friedman, and D. L. Court. 1994. Bacteriophage 
lambda N-dependent transcription antitermination. 
Competition for an RNA site may regulate antitermination.
J. Mol. Biol. 236:217-228.

130. Plunkett, G. 3rd, D. J. Rose, T. J. Durfee, and F. R. Blattner.
1999. Sequence of Shiga toxin 2 phage 933W from Escherichia
coli O157:H7: Shiga toxin as a phage late-gene
product. J. Bacteriol. 181:1767-1778. 

131. Ptashne, M. 1992. A Genetic Switch, 2nd edn. Cell Press, 
Blackwell Scientific, Cambridge, Mass. 

132. Rees, W. A., S. E. Weitzel, T. D. Yager, A. Das, and 
P. H. von Hippel. 1996. Bacteriophage lambda N protein 
alone can induce transcription antitermination in vitro. 
Proc. Natl. Acad. Sci. USA 93:342-346.



133. Richardson, J. P., and J. Greenblatt. 1996. Control of 
RNA chain elongation and termination, pp. 822-848. 
In F. C. Neidhardt (ed.) Escherichia coli and Salmonella:
Cellular and Molecular Biology. American Society for 
Microbiology, Washington, D.C. 

134. Ring, B. Z., and J. W. Roberts. 1994. Function of a nontranscribed
DNA strand site in transcription elongation. Cell
78:317-324. 

135. Ring, B. Z., W. S. Yarnell, and J. W. Roberts. 1996. Function
of E. coli RNA polymerase sigma factor sigma 70 in 
promoter-proximal pausing. Cell 86:485-493. 

136. Robert, J., S. B. Sloan, R. A. Weisberg, M. E. Gottesman, 
R. Robledo, and D. Harbrecht. 1987. The remarkable specificity
of a new transcription termination factor suggests
that the mechanisms of termination and antitermination 
are similar. Cell 51:483-492. 

137. Roberts, J. 1992. Antitermination and the control of 
transcription elongation, pp. 389-406. In S. L. McKnight 
and K. R. Yamamoto (eds.) Transcriptional Regulation. Cold
Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y. 

138. Roberts, J. W. 1969. Termination factor for RNA synthesis. 
Nature 224:1168-1174. 

139. Roberts, J. W. 1975. Transcription termination and late 
control in phage lambda. Proc. Natl. Acad. Sci. USA
72:3300-3304. 

140. Roberts, J. W. 1976. Transcription termination and 
its control in E. coli, pp. 247-271. In R. Losick and
M. Chamberlin (eds.), RNA Polymerase. Cold Spring 
Harbor Laboratory Press, Cold Spring Harbor, N.Y. 

141. Roberts, J. W. 1993. RNA and protein elements of E. coli 
and lambda transcription antitermination complexes. 
Cell 72:653-655. 

142. Roberts, J. W., C. W. Roberts, S. Hilliker, and D. Botstein. 
1976. Transcription termination and regulation in
bacteriophages P22 and lambda, pp. 707-718. In R. Losick 
and M. Chamberlin (eds.) RNA Polymerase. Cold Spring 
Harbor Laboratory Press, Cold Spring Harbor, N.Y. 

143. Roberts, J. W., W. Yarnell, E. Bartlett, J. Guo, M. Marr, 
D. C. Ko, H. Sun, and C. W. Roberts. 1998. Antitermination 
by bacteriophage lambda Q protein. Cold Spring Harb.
Symp. Quant. Biol. LXIII:319-325.

144. Robledo, R., B. L. Atkinson, and M. E. Gottesman. 1991. 
Escherichia coli mutations that block transcription termination
by phage HK022 Nun protein. J. Mol. Biol.
220:613-619. 

145. Robledo, R., M. E. Gottesman, and R. A. Weisberg. 1990. 
λ nutR mutations convert HK022 Nun protein from a
transcription termination factor to a suppressor of 
termination. J. Mol. Biol. 212:635-643. 

146. Rosenberg, M., D. Court, H. Shimatake, C. Brady, and 
D. L. Wulff. 1978. The relationship between function and 
DNA sequence in an intercistronic regulatory region 
in phage lambda. Nature 272:414-423. 

147. Ruteshouser, E. C., and J. P. Richardson. 1989. Identification
and characterization of transcription termination
sites in the Escherichia coli lacZ gene. J. Mol. Biol. 
208:23-43. 

148. Salstrom, J. S., and W. Szybalski. 1978. Coliphage lambda 
nutL-: a unique class of mutants defective in the site of
gene N product utilization for antitermination of leftward
transcription. J. Mol. Biol. 124:195-221. 

149. Scharpf, M., H. Sticht, K. Schweimer, M. Boehm, 
S. Hoffmann, and P. Rosch. 2000. Antitermination in 
bacteriophage lambda. The structure of the N36 peptide-boxB
RNA complex. Eur. J. Biochem. 267:2397-2408.

150. Schauer, A. T., D. L. Carver, B. Bigelow, L. S. Baron, and 
D. I. Friedman. 1987. Lambda N antitermination system: 
functional analysis of phage interactions with the host 
NusA protein. J. Mol. Biol. 194:679-690. 

151. Schauer, A. T., S. C. Cheng, C. Zheng, L. St. Pierre, D. Alessi, 
D. L. Hidayetoglu, N. Costantino, D. L. Court, and 
D. I. Friedman. 1996. The alpha subunit of RNA polymerase
and transcription antitermination. Mol. Microbiol.
21:839-851. 

152. Sen, R., R. A. King, N. Mzhavia, P. L. Madsen, and 
R. A. Weisberg. 2002. Sequence specific interaction of 
nascent antitermination RNA with the zinc finger motif 
of E. coli RNA polymerase. Mol. Microbiol. 46:215-222.

153. Sen, R., R. A. King, and R. A. Weisberg. 2001. Modification 
of the properties of elongating RNA polymerase by persistent
association with nascent antiterminator RNA. Mol.
Cell 7:993-1001.

154. Sharrock, R. A., R. L. Gourse, and M. Nomura. 1985. Defective
antitermination of rRNA transcription and derepression
of rRNA and tRNA synthesis in the nusB5 mutant of
Escherichia coli. Proc. Natl. Acad. Sci. USA 82:5275-5279. 

155. Sidorenkov, I., N. Komissarova, and M. Kashlev. 1998. 
Crucial role of the RNA:DNA hybrid in the processivity of
transcription. Mol. Cell 2:55-64. 

156. Sigmund, C. D., and E. A. Morgan. 1988. NusA protein 
affects transcriptional pausing and termination in vitro 
by binding to different sites on the transcription complex. 
Biochemistry 27:5622-5627. 

157. Sloan, S. B., and R. A. Weisberg. 1993. Use of a gene encoding
a suppressor tRNA as a reporter of transcription:
analyzing the action of the Nun protein of bacteriophage 
HK022. Proc. Natl. Acad. Sci. USA 90:9842-9846. 

158. Smith, H. W., P. Green, and Z. Parsell. 1983. Vero cell toxins
in Escherichia coli and related bacteria: transfer by phage 
and conjugation and toxic action in laboratory animals, 
chickens and pigs. J. Gen. Microbiol. 129:3121-3137. 

159. Somasekhar, G., and W. Szybalski. 1983. Mapping of 
the Q-utilization site (qut) required for antitermination
of late transcription in bacteriophage lambda. Gene 
26:291-294. 

160. Somasekhar, G., and W. Szybalski. 1987. The functional 
boundaries of the Q-utilization site required for antitermination
of late transcription in bacteriophage
lambda. Virology 158:414-426. 

161. Squires, C. L., J. Greenblatt, J. Li, and C. Condon. 1993.
Ribosomal RNA antitermination in vitro: requirement 
for Nus factors and one or more unidentified cellular 
components. Proc. Natl. Acad. Sci. USA 90:970-974. 

162. Su, L., J. T. Radek, L. A. Labeots, K. Hallenga, P. Hermanto, 
H. Chen, S. Nakagawa, M. Zhao, S. Kates, and M. A. Weiss. 
1997. An RNA enhancer in a phage transcriptional 
antitermination complex functions as a structural 
switch. Genes Dev. 11:2214-2226. 



163. Sullivan, S. L., and M. E. Gottesman. 1992. Requirement for 
E. coli NusG protein in factor-dependent transcription 
termination. Cell 68:989-994. 

164. Sullivan, S. L., D. F. Ward, and M. E. Gottesman. 1992. 
Effect of Escherichia coli nusG function on lambda 
N-mediated transcription antitermination. J. Bacteriol.
174:1339-1344. 

165. Tan, R., and A. D. Frankel. 1995. Structural variety of 
arginine-rich RNA-binding peptides. Proc. Natl. Acad. 
Sci. USA 92:5282-5286. 

166. Taura, T., C. Ueguchi, K. Shiba, and K. Ito. 1992. Insertional
disruption of the nusB (ssyB) gene leads to cold-sensitive
growth of Escherichia coli and suppression of the secY24 
mutation. Mol. Gen. Genet. 234:429-432. 

167. Van Gilst, M. R., W. A. Rees, A. Das, and P. H. von Hippel. 
1997. Complexes of N antitermination protein of phage 
lambda with specific and nonspecific RNA target sites on 
the nascent transcript. Biochemistry 36:1514-1524. 

168. Van Gilst, M. R., and P. H. von Hippel. 1997. Assembly of the 
N-dependent antitermination complex of phage lambda: 
NusA and RNA bind independently to different unfolded 
domains of the N protein. J. Mol. Biol. 274:160-173. 

169. Vogel, U., and K. F. Jensen. 1997. NusA is required for 
ribosomal antitermination and for modulation of the 
transcription elongation rate of both antiterminated 
RNA and mRNA. J. Biol. Chem. 272:12265-12271. 

170. Wagner, P. L., M. N. Neely, X. Zhang, D. W. Acheson, 
M. K. Waldor, and D. I. Friedman. 2001. Role for a phage 
promoter in Shiga toxin 2 expression from a pathogenic 
Escherichia coli strain. J. Bacteriol. 183:2081-2085. 

171. Ward, D. F., A. DeLong, and M. E. Gottesman. 1983. 
Escherichia coli nusB mutations that suppress nusA1
exhibit lambda N specificity. J. Mol. Biol. 168:73-85.

172. Ward, D. F., and M. E. Gottesman. 1981. The nus mutations
affect transcription termination in Escherichia coli.
Nature 292:212-215.

173. Warren, F., and A. Das. 1984. Formation of termination-resistant
transcription complex at phage lambda nut
locus: effects of altered translation and a ribosomal
mutation. Proc. Natl. Acad. Sci. USA 81:3612-3616.

174. Watnick, R. S., and M. E. Gottesman. 1998. Escherichia coli
NusA is required for efficient RNA binding by phage HK022
Nun protein. Proc. Natl. Acad. Sci. USA 95:1546-1551.

175. Watnick, R. S., and M. E. Gottesman. 1999. Binding of transcription
termination protein Nun to nascent RNA and
template DNA. Science 286:2337-2339.

176. Watnick, R. S., S. C. Herring, A. G. Palmer 3rd, and 
M. E. Gottesman. 2000. The carboxyl terminus of phage 
HK022 Nun includes a novel zinc-binding motif and a 
tryptophan required for transcription termination. Genes 
Dev. 14:731-739. 

177. Weisberg, R. A., and M. E. Gottesman. 1999. Processive 
antitermination. J. Bacteriol. 181:359-367. 

178. Weisberg, R. A., M. E. Gottesman, R. W. Hendrix, and 
J. W. Little. 1999. Family values in the age of genomics: 
comparative analysis of temperate bacteriophage HK022. 
Annu. Rev. Genet. 33:565-602. 

179. Whalen, W., B. Ghosh, and A. Das. 1988. NusA protein
is necessary and sufficient in vitro for phage lambda N
gene product to suppress a rho-independent terminator
placed downstream of nutL. Proc. Natl. Acad. Sci. USA
85:2494-2498.

180. Whalen, W. A., and A. Das. 1990. Action of an RNA site
at a distance: role of the nut genetic signal in transcription
antitermination by phage-lambda N gene product.
New Biol. 2:975-991.

181. Wilson, H. R., L. Kameyama, J. G. Zhou, G. Guarneros, and
D. L. Court. 1997. Translational repression by a transcriptional
elongation factor. Genes Dev. 11:2204-2213.

182. Wilson, H. R., D. Yu, H. K. Peters 3rd, J. G. Zhou, and D. L.
Court. 2002. The global regulator RNase III modulates
translation repression by the transcription elongation
factor N. EMBO J. 21:4154-4161.

183. Worbs, M., G. P. Bourenkov, H. D. Bartunik, R. Huber, and
M. C. Wahl. 2001. An extended RNA binding surface
through arrayed S1 and KH domains in transcription
factor NusA. Mol. Cell 7:1177-1189.

184. Yager, T. D., and P. H. von Hippel. 1987. Transcript elongation
and termination in Escherichia coli, pp. 1241-1275.
In F. C. Neidhardt, J. L. Ingraham, B. Magasanik, K. B. Low, 
M. Schaechter, and H. E. Umbarger (eds.), Escherichia coli 
and Salmonella typhimurium: Cellular and Molecular 
Biology. ASM, Washington, D.C. 

185. Yang, X. J., C. M. Hart, E. J. Grayhack, and J. W. Roberts. 1987.
Transcription antitermination by phage lambda gene Q
protein requires a DNA segment spanning the RNA start
site. Genes Dev. 1:217-226.

186. Yang, X. J., and J. W. Roberts. 1989. Gene Q antiterminator
proteins of Escherichia coli phages 82 and lambda
suppress pausing by RNA polymerase at a rho-dependent
terminator and at other sites. Proc. Natl. Acad. Sci. USA
86:5301-5305.

187. Yarnell, W. S., and J. W. Roberts. 1992. The phage lambda
gene Q transcription antiterminator binds DNA in the
late gene promoter as it modifies RNA polymerase. Cell 
69:1181-1189. 

188. Yarnell, W. S., and J. W. Roberts. 1999. Mechanism of 
intrinsic transcription termination and antitermination. 
Science 284:611-615. 

189. Zheng, C., and D. I. Friedman. 1994. Reduced Rho-dependent
transcription termination permits NusA-independent
growth of Escherichia coli. Proc. Natl. Acad.
Sci. USA 91:7543-7547.

190. Zhou, Y., J. J. Filter, D. L. Court, M. E. Gottesman, and
D. I. Friedman. 2002. A Requirement for NusG for transcription
antitermination in vivo by the λ N protein.
J. Bacteriol. 184:3416-3418.

191. Zhou, Y., T. F. Mah, J. Greenblatt, and D. I. Friedman. 2002.
Evidence that the KH RNA-binding domains influence
action of the E. coli NusA protein. J. Mol. Biol.
318:1175-1188.

192. Zhou, Y., T. F. Mah, Y. T. Yu, J. Mogridge, E. R. Olson,
J. Greenblatt, and D. I. Friedman. 2001. Interactions of 
an Arg-rich region of transcription elongation protein 
NusA with NUT RNA: implications for the order of 
assembly of the lambda N antitermination complex 
in vivo. J. Mol. Biol. 310:33-49. 



10 


Phage Lysis 

RY YOUNG 
ING-NANG WANG 


Timing Is Everything 

In general, bacteriophages must lyse their host cells to 
liberate the progeny virions. The most common misconception
about host lysis is that it is simply the inevitable
outcome of phage infection, a kind of sad denouement 
that follows after the elegant processes defining successive 
waves of gene expression and the intricate and coordinated 
pathways leading to virion morphogenesis. For example, 
in the otherwise captivating description of the molecular 
processes of phage life cycles in The Molecular Biology of 
the Gene (112), it is explained that some phage proteins 
are needed early, “while others, such as the viral coat 
proteins or the lysozyme for bacterial cell lysis, must only 
appear later,” leading to the end of the infective cycle of 
phage T7. “Finally, T7 lysozyme destroys the bacterial cell
wall and allows newly formed phage particles to escape
and renew their growth cycle in other bacteria.” The impression
given is that lysis is caused by the late appearance
of lysozyme activity, despite the fact that T7 lysozyme is
the product of an early gene. Also, the implication is that
release and reinfection are needed to “renew” a completed
vegetative program, and that lysis occurs only after virion
production has been completed. However, it has been known
for decades that if
the lysis genes are ablated, the accumulation of intracellular 
virions can continue unabated for extended periods, far 
beyond the time of lysis of parental phage (67). In fact, a 
recent re-analysis of the kinetics of virion accumulation 
for lysis-defective mutant phages indicated that it is essentially
linear for periods at least 4-fold longer than the wild-type
vegetative cycle (110). Thus lysis is a programmed event,
like all the other steps in the infectious cycle. It is the thesis 
here that the decision of when to terminate the infection 
and lyse the host is the only major decision made in the 
vegetative cycle; that is, that all other macromolecular 
processes proceed at rates which maximize the intracellular 
accumulation of infective virions. 


Lysis Systems: At Least Two Strategies 

Most bacteria have a murein cell wall that constitutes the 
principal barrier to host lysis. Compromising the cell wall is 
thus the fundamental goal for lytic processes. Two general 
strategies to accomplish lysis have been described (121). 
Phages with double-stranded nucleic acid genomes use the 
“holin-endolysin” strategy. In this scheme, the phage elaborates
a murein-degrading enzyme, an endolysin (also known
as lysozyme), specifically dedicated to lysis, and a second, 
membrane-embedded protein, the holin, which serves to 
activate the endolysin at a precisely defined time. This is 
the basis of the programmed nature of host lysis. As the 
principal scheduling component, the holin is subject to 
elaborate regulatory schemes in real time and to exacting 
evolutionary pressures. This chapter will focus heavily on 
the remarkable properties of holins, which can be considered
the smallest and simplest biological timers.

In contrast, small single-stranded nucleic acid phages,
with very limited genomes, achieve lysis without encoding
a muralytic enzyme. For phage biologists, this “lysis-without-lysozyme”
has been a long-standing mystery that
now, for at least two such phages, has been solved, at least
in outline. In the cases of the ssDNA phage φX174 and the
ssRNA phage Qβ, a single phage protein, for which we have
proposed the term “amurin” (14), causes lysis by acting as a
specific inhibitor of an enzyme in the multi-step pathway of
murein biosynthesis (10, 12, 14). Because this kind of host
lysis requires continued cell growth and appears to involve
ruptures of the wall at the developing septum, just as in the
case of cell wall antibiotics like penicillin, the amurins can
be considered “protein antibiotics.” Remarkably, the enzymatic
steps in the murein synthesis pathway that are inhibited
by the φX174 and Qβ amurins are different, and there
is evidence that other small phages attack other steps
(T. G. Bernhardt, B. McIntosh, and R. Young, unpublished
data), raising the prospect that in the largely unexplored
universe of small phages, there may be an amurin specific
for many of the steps in the widely conserved pathway of
peptidoglycan synthesis.

Holin-Endolysin Lysis: The Universal 
Strategy for Complex Phages 

Why Did Holins Evolve? 

In principle, lysis requires only that the cell wall be at 
least partly degraded, or that its synthesis be corrupted 
during growth. Thus, to achieve lysis a phage needs only 
to produce a muralytic enzyme with a secretory signal 
sequence or to generate a polypeptide that blocks murein 
biosynthesis. However, it appears that all phages with 
sufficient genomic space have evolved a more sophisticated 
lysis system, involving at least one other protein, the holin, 
which imposes a strict temporal program on the muralytic
enzyme, the endolysin. The universality of the holin
program, ensuring that lysis occurs at a programmed time 
despite the undiminished capacity of the infected cell for 
continued virion production, must reflect an evolutionary 
adaptation. Theoretical analysis suggests that, for any specified
host and environmental scenario, there is an ideal
or optimal lysis time (1, 110; chapter 5). Irrespective of 
the details of the calculation, the clarifying perspective is 
simple to state: continued linear accumulation of virions 
in the infected host must be balanced against the potential 
for exponential accumulation if the progeny virions are 
released to pursue new prey. Thus a lysis system in which 
the programmed time of lysis can be easily adjusted in 
response to the imposition of different physiological and 
environmental conditions would contribute strongly to 
fitness. This adjustability could be reflected in terms of 
genetic variability of the timing, such that most missense 
changes in the timing gene would cause profound changes 
in the lysis time, allowing alleles with the appropriate lysis 
timing to arise rapidly, or it might take the form of real-time 
regulation of the lysis event, using environmental signals as 
inputs to key the programming of lysis. 
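
The balance between linear intracellular accumulation and exponential spread can be illustrated numerically. The following Python sketch is not taken from the cited analyses; it assumes a deliberately simple model in which virions accumulate linearly after a fixed eclipse period, and the phage population multiplies by one burst per cycle of lysis time plus host-search time. The function names and all parameter values (eclipse period, accumulation rate, search times) are hypothetical and chosen only for illustration.

```python
import math


def growth_rate(lysis_time, search_time, eclipse=10.0, rate=5.0):
    """Exponential growth rate (per minute) of a phage population.

    Assumed toy model: no virions exist before `eclipse` minutes, then they
    accumulate linearly at `rate` virions per minute. Each infective cycle
    lasts lysis_time + search_time and multiplies the population by the
    burst size.
    """
    burst = rate * (lysis_time - eclipse)
    if burst <= 0:
        return float("-inf")  # lysis before any virion has been assembled
    return math.log(burst) / (lysis_time + search_time)


def optimal_lysis_time(search_time, eclipse=10.0, rate=5.0,
                       step=0.1, t_max=300.0):
    """Grid search for the lysis time that maximizes growth_rate."""
    n_steps = int((t_max - eclipse) / step)
    candidates = [eclipse + step * i for i in range(1, n_steps + 1)]
    return max(candidates,
               key=lambda t: growth_rate(t, search_time, eclipse, rate))


if __name__ == "__main__":
    # Longer search times stand in for environments with scarcer hosts.
    for search in (5.0, 30.0, 120.0):
        best = optimal_lysis_time(search)
        print(f"search time {search:5.1f} min -> optimal lysis at {best:5.1f} min")
```

Under these assumptions, the optimum shifts to later lysis as hosts become harder to find, which is the qualitative behavior that an adjustable lysis timer would be expected to exploit.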

What characteristics would be appropriate for a system
evolved to effect host lysis at a defined time? First,
the lysis system should be as saltatory as possible; that
is, the lysis system should not affect the productivity of
the infected host, in terms of virion assembly, until the
programmed time of lysis. Second, the lysis system should
be very efficient and rapid once the infective cycle is terminated.
That is, the time between the cessation of host macromolecular
synthesis and the bursting of the cell should
be minimal; there is no profit from dwell time in a corpse.
Finally, the timer should be capable of being overridden
in real time, should circumstances change during an
infection. For example, if environmental signals are received
indicating a lack of available hosts in the environment,
the timer should be blocked from triggering lysis so that
at least the continued linear accumulation of virions
can be maintained. Conversely, if there is a sudden loss
of energy or a superinfection by a heterologous phage,
the timer should be triggered to allow immediate lysis
(“bail out”).

All these characteristics are exhibited by the holin-endolysin
system of lysis and are mostly due to the properties
of holins alone.

Definitions and Significance 

Holins are small, phage-encoded membrane proteins
which accumulate in the cytoplasmic membrane of the
host. At a precise time programmed into its primary structure,
the holin suddenly causes disruption of the membrane,
or “hole formation.” This terminology was chosen deliberately
because the word “hole” has an appropriately vague
connotation not connected with a specific membrane
structure. Also, the use of this term avoids confusion with
porins, which are well-defined proteins in the outer
membrane of Gram-negative bacteria. Hole formation leads
directly, and usually very rapidly, to destruction of the
cell wall by the phage-encoded muralytic enzymes, or endolysins.
Although recent evidence suggests that considering
endolysins as passive, uninteresting reporter enzymes for
holin activity may be too simplistic (see Emerging Perspectives,
below), much of this chapter is devoted to documenting
the molecular basis for the remarkable capacities of
holins, the simplest of biological “timers.” Again, this choice
of terminology is calculated, because holins are not biological
clocks, in the classic sense. Biological clocks run continuously
and may be referenced by many developmental
processes, whereas holins run only once, act on their own
to impose a precise temporal limit on the phage infective
cycle, and, insofar as is known, have no other effect on the
sequencing of the macromolecular events of virion morphogenesis.
This conclusion in no way diminishes the importance
or significance of holins in the biosphere. Phage
predation of bacteria accounts for half or more of the flux of
carbon through the world's total marine biomass, where
50 Gt of C is fixed annually (44, 45, 113). As noted above,
the only actively controlled variable in lytic growth is
the length of the vegetative cycle; thus, this most ancient
temporal regulation is a profoundly important factor for
life on Earth. As befits ancient proteins with such a heavy
responsibility, holins are incredibly diverse, perhaps the
most diverse group of proteins with common function
in biology.

It is worth noting that the term “endolysin” is used for
phage muralytic enzymes because of its place in a historically
important debate over whether the enzyme activities
that appear during viral infections are virus-encoded or
induced from the host chromosome (36). Moreover, the
more familiar name lysozyme has been used as the name for
the muralytic enzymes of λ, T4, and T7, despite the fact that
the three proteins have different enzymatic activities and
only the T4 lysozyme has the same enzymatic action as the
generic egg white lysozyme.

The λ Paradigm

At the time of the textbook citation above, phage genetics
had already revealed that the muralytic enzyme was not
the only required lysis protein. In three different systems,
λ (49, 57), P22 (35), and T4 (67), phage mutants had been
identified which not only failed to cause lysis but continued
to accumulate virions and lysozyme intracellularly far
beyond the wild-type lysis time. These mutations defined λ
S, its close relative P22 13, and the unrelated T4 t, all of
which encode holins. The phenotypes highlight a characteristic
unique to holin genes: they are the only genes with
greatly increased production of virions as part of their null
phenotypes (111)!

The paradigm holin-endolysin lysis system is that
of phage λ. The λ lysis cassette, immediately downstream
of the late promoter, has an unusually rich architecture,
consisting of three tandem cistrons, SRRz (figure 10-1A),
which give rise to five protein products. One of the extra
products is encoded by a fourth cistron, Rz1, which is embedded
in an alternate reading frame within Rz; Rz-Rz1 is
thought to constitute a protein complex that somehow
attacks the outer membrane. S encodes two proteins, the
holin and the antiholin, as a result of translational initiation
sites defined by codons 3 and 1, respectively. R encodes
the λ endolysin, which has transglycosylase activity.
The five proteins holin, endolysin, antiholin, Rz, and Rz1
represent the complete complement of lysis proteins. Here
their roles in lysis and what is known about the molecular
mechanisms will be described by first establishing what is
known about the λ system and then comparing other
less well-characterized systems to it.

The Lysis Physiology of λ

It is most revealing to focus first on the null phenotypes of
the holin and endolysin (figure 10-2A). Lysis phenotypes
can be reproducibly assessed by inducing a lysogen carrying
a thermosensitive λ prophage under carefully controlled
physiological conditions. Because of the synchronicity
inherent in the sudden inactivation of the thermolabile λ
repressor, S alleles with triggering times differing by 1-2
minutes are easily distinguishable. An even more convenient
version of this, in terms of genetic manipulation, is a
lysogen with a λΔ(SR) prophage carrying a medium copy
plasmid with the promoter-proximal segment of the λ late
transcriptional unit, including the promoter, pR', and the
lysis cassette; in this system, induction of the lysis-defective
prophage also causes trans-activation of the plasmid-borne
lysis genes, resulting in approximately the same lysis timing
as found in induction of a lysis-proficient prophage (31, 100).
In figure 10-2A, key physiological properties of the λ lysis
system, revealed using these kinds of assay systems on null
mutants of the lysis genes, are shown:

1. S⁻ allows continued increase in culture mass (and,
consequently, intracellular virions), whereas R⁻ halts
growth at the normal lysis time;

2. addition of CHCl₃ results in instant lysis with S⁻, but
not R⁻;

3. addition of energy poisons triggers lysis with S⁺R⁺,
but not S⁻R⁺.

These observations are completely paralleled in experiments 
with phage T4 t holin and e endolysin mutants (119). The 
interpretation is that the holin triggers at a defined time to 
cause a lethal lesion that makes the membrane barrier 
permeable to the endolysin, which otherwise accumulates 
in the cytosol. Moreover, the holin can be artificially triggered
with an energy poison, or replaced by chloroform-mediated
dissolution of the membrane. Figure 10-2B shows
that the triggering time is allele-specific for S; single-residue
changes in many different positions in S can result in earlier 
or later lysis times (also see figure 10-1B). These are all alleles 
of S105, an S cistron in which codon 1 has been eliminated to 
ablate production of the antiholin and thus simplify the 
interpretation of the timing phenotypes. 
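
The three observations and their interpretation can be summarized as a small decision rule. The Python sketch below is only a toy encoding of the phenotypes described in this section, not a quantitative model; the function name, argument names, and outcome labels are hypothetical, and it assumes that any chloroform or energy poison treatment is applied after endolysin has had time to accumulate in the cytosol.

```python
def lysis_outcome(holin, endolysin, chloroform=False, energy_poison=False):
    """Toy decision rule for an induced lambda lysogen.

    holin, endolysin: True for functional S and R alleles, respectively.
    chloroform, energy_poison: True if the treatment is applied (assumed
    to happen after endolysin has accumulated in the cytosol).
    """
    # The membrane is breached ahead of schedule if chloroform dissolves it,
    # or if an energy poison prematurely triggers a functional holin.
    breached_early = chloroform or (holin and energy_poison)
    if endolysin and breached_early:
        return "immediate lysis"
    if endolysin and holin:
        return "lysis at programmed time"
    # Without endolysin the murein is never degraded; without holin (and
    # without chloroform) the endolysin never reaches the cell wall.
    return "no lysis"


if __name__ == "__main__":
    print(lysis_outcome(holin=True, endolysin=True))
    print(lysis_outcome(holin=False, endolysin=True, chloroform=True))
    print(lysis_outcome(holin=True, endolysin=False, chloroform=True))
    print(lysis_outcome(holin=False, endolysin=True, energy_poison=True))
```

The rule deliberately ignores the growth arrest seen with R⁻ at the normal lysis time, since it tracks only whether and when the cell lyses.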

The λ Endolysin and Endolysins in General

λ R is an 18 kDa soluble transglycosylase, which cleaves the
glycosidic bond between N-acetylglucosamine and N-acetylmuramic
acid residues, forming a cyclic product (15). Its crystal
structure reveals that it is evolutionarily related to true
lysozymes, which hydrolyze the same bond (43). R appears
to be dependent on the holin for crossing the membrane,
although at about 3 hours after induction some
R-dependent lysis of λSam-infected cells does begin to occur
(R. Young, unpublished data). Whether this reflects an intrinsic
membrane-penetrating ability of R or low-level access
to the sec system is unclear. The N and C termini of R are
surface-exposed in crystal structures (43), a structural
feature exploited to construct fusion proteins for probing
the size of the “hole” (see below).

Figure 10-1 Bacteriophage lysis genes. A: The lysis genes, with length in codons, of four lambdoid (λ, P22, PS3, 21) and three
non-lambdoid (P1, P2, T7) phages are shown: holin genes (vertical lines), endolysin genes (solid gray or black shading),
nested Rz (horizontally lined)-Rz1 (striped) genes (96, 111), and the antiholin genes lydB of P1 and lysA of P2 (left-slanted
lines) (124). The Rz-Rz1 genes in P2 and T7 are called lysB-lysC and 18.5-18.7, respectively. Identical decoration schemes for
any two boxes denote that there is detectable sequence similarity. Among the endolysins, the enzymatic activities are:
black = muramidase (lysozyme); light gray = transglycosylase; dark gray = amidase. White boxes are other phage genes not
involved in lysis. In the case of the lambdoid holin genes with dual-start motifs, the lengths of the holin and antiholin reading
frames, in codons, are shown above each box. The names of the λ lysis genes (S, R, Rz, and Rz1) are used for the lambdoid
genes; for the Salmonella phages P22 and PS3, the corresponding gene names would be 13, 19, 15 and Rz1, respectively. Note
that in T7, the endolysin gene, 3.5, is located in an early gene cluster; its position in the lysis cassette is taken by an unrelated
virion maturation gene. The positions of late gene promoters serving the lysis genes are indicated by arrows. Shown in the
inset above the λ lysis cassette is the 5' end of the S mRNA sequence, where two stem-loop structures control the partition
of translational initiations between the start codons of the S107 and S105 reading frames. Nucleotide positions relative to
the first base of codon 1 are shown in italics. Adapted from Young (120) and Johnson-Boaz et al. (65), with permission.

B: S mutants. The parental S sequence and characterized mutations are shown in single-letter code, with the starting
residues of the S107 (Met1) and S105 (Met3) products underlined, predicted side-chain charge indicated above the sequence,
and residues in the three transmembrane domains (TMDs) enclosed in gray boxes. All alleles shown are missense except that
“x” denotes a nonsense allele. Key: upper case = early lysis alleles; lower case = unconditional lysis-defective alleles; lower
case italic = delayed lysis alleles; strike-through = lysis-neutral alleles. Underlined and strike-through missense alleles are all
single-cysteine substitutions derived from the C51S allele. Adapted from Young (120), with permission.

Figure 10-2 Lysis phenotypes. A: Lysis of a λ lysogen induced in exponential phase at t = 0. S⁺R⁺ = triangles; S⁺R⁺ with 5 mM KCN
added = filled squares; S⁺R⁻ = open diamonds; S⁻R⁺ = circles; S⁻R⁺ with CHCl₃ added = filled diamonds;
S A52G R⁺ = open squares. Adapted from Wang et al. (111), with permission. B: Lysis phenotypes of single-cysteine substitution
alleles. Exponential cells lysogenic for λΔ(SR) and carrying a plasmid with the λ lysis cassette under pR' (also known as PR')
control were induced at t = 0. All alleles except the null Sam7 are isogenic to the parental S105 allele (with codon 1, the
initiation site for the S107 reading frame, inactivated). Stars indicate alleles with lysis times earlier than the S105 parental
allele. Adapted from Gründling et al. (53), with permission. C: RNA structure control of the S105/S107 partition and lysis
timing partition. Left: Western blot, using anti-S antibodies, of cytoplasmic membrane fractions from cells induced for
various S alleles with mutations in the translational initiation region, including the dominant lysis-defective sdi mutations that
defined the upstream regulatory RNA stem-loop (see figure 10-1A, inset). The first three lanes have (a) S⁺, (b) S107 and
(c) S105; in the latter two alleles, the AUG start codons 3 and 1, respectively, are changed to CUG, eliminating the production
of S105 and S107, respectively. Alleles in other lanes: (d) sdi3 (G−9 to A), (e) sdi1 (C−19 to U), (f) G−3 to A, (g) inversion of sdi
stem-loop by swapping UCCCC and GGGGG sequences, (h) UAAG (−6 to −3) replaced by GAGG, (i) C33 to G, (j) combination
of G−3 to A and C33 to G, (k) U−24 to C, (l) C36 to G and A48 to C, and (m) both codons 1 and 3 to CUG. Mutations that weaken
the sdi (upstream) stem-loop, strengthen the downstream loop or damage the Shine-Dalgarno sequence for S105, as in lanes
(d), (e), (f), and (l), favor S107 production over S105. Conversely, strengthening the sdi structure, weakening the
downstream loop or improving the S105 Shine-Dalgarno sequence, as in lanes (h), (i), and (k), favors S105 production over S107.
Inversion of the sdi structure separates the S107 reading frame from its Shine-Dalgarno sequence and abolishes S107
production, and changing both start codons to CUG (m) eliminates both S gene products. Adapted from Chang et al. (33),
with permission.

The S105 Holin: Genetics and Biochemistry

The S holin, or S105, is a 105-residue integral cytoplasmic
membrane protein with three transmembrane domains
(TMDs) (52). Wild-type S protein is present at about 10³
copies per cell at lysis (33). Nothing is known about its
distribution in the membrane but, assuming a 1 nm packing
diameter for each helical TMD, the total amount of S can
only occupy about 3000 nm², or 0.03%, of the membrane
surface if clustered, but could span 3000 nm, approximately
the cylindrical circumference of the cell, if arranged as an
end-to-end polymer of adjacent TMDs. S105 inserts in
the membrane in a sec-independent fashion, as judged by
the insensitivity of S-mediated lysis to depletion of SecE
(E. Ramanculov and R. Young, unpublished) and to shifts
to the nonpermissive temperature in secA ts and sec hosts
(91). Its N-terminus is blocked, presumably because it traverses
the bilayer before deformylation can occur; stably
solubilized in a zwitterionic detergent, it has a CD spectrum
consistent with three helical TMDs (J. F. Deaton and R. Young,
unpublished data).
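
The packing estimate for S105 given in this section (about 10³ copies, three TMDs each, 1 nm per helix) can be checked with a few lines of arithmetic. In the Python sketch below, the copy number, TMD count, and helix diameter come from the text, whereas the cell dimensions (a rod 1 µm wide and 3 µm long with hemispherical caps) are assumptions made here for illustration and are not stated in the text.

```python
import math

# Values taken from the text
copies_per_cell = 1_000   # about 10^3 S molecules per cell at lysis
tmds_per_holin = 3        # three transmembrane domains per S105
tmd_diameter_nm = 1.0     # packing diameter of one helical TMD

# Hypothetical cell geometry (assumed, not from the text):
# a rod with hemispherical caps
cell_diameter_nm = 1_000.0
cell_length_nm = 3_000.0  # total length, caps included

n_helices = copies_per_cell * tmds_per_holin

# Clustered: each helix is given a footprint of roughly 1 nm x 1 nm
clustered_area_nm2 = n_helices * tmd_diameter_nm ** 2
# Linear: all helices laid side by side in a single row
linear_span_nm = n_helices * tmd_diameter_nm

# Surface of a cylinder with hemispherical caps = pi * d * (total length)
membrane_area_nm2 = math.pi * cell_diameter_nm * cell_length_nm
circumference_nm = math.pi * cell_diameter_nm

fraction = clustered_area_nm2 / membrane_area_nm2
print(f"clustered area: {clustered_area_nm2:.0f} nm^2 "
      f"({fraction:.2%} of the membrane)")
print(f"linear span: {linear_span_nm:.0f} nm "
      f"(cell circumference: {circumference_nm:.0f} nm)")
```

With these assumed dimensions, the clustered footprint is a few hundredths of a percent of the membrane, while the single-file arrangement is close to one turn around the cell, in line with the figures quoted above.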

Many S mutants have been isolated, including not only
unconditional lysis defectives but also a plethora of alleles
which accelerate or delay the onset of lysis (figure 10-1B).
Thus S satisfies the requirement for easy genetic adjustability
in terms of finding an ideal lysis time. Among the
lysis-defectives, there are dominant, recessive, and “antidominant”
(previously “early dominant”) alleles; the latter
are lysis-defectives which accelerate lysis in the presence of
the wild-type allele (88)! A particularly sensitive microdomain
appears to be in the middle of TMD2 (figure 10-1B).
Both λ S A52V and λ S A52G are non-plaque-formers, but for
opposite reasons. S A52V is unconditionally lysis-defective,
whereas S A52G supports catastrophically early lysis, at
19 minutes, before the first virion is assembled and when
the level of S protein is at about 10% of its normal level at
lysis (65). Similarly, at Cys51, substitution of a Ser accelerates
lysis, whereas replacement with a Tyr abolishes lysis. Thus
changes in bulk of only 16 or 14 Da at positions 51 and
52, respectively, cause significant premature triggering of
the holin. Although less drastic, lysis timing changes are
found for single substitutions all the way around the helical
surfaces of all three TMDs (figure 10-2B). This collection of
single-Cys mutants was generated in an effort to map the
TMDs of S by cysteine-modification accessibility, starting
with the cysteine-less C51S allele (52, 53). Except for two
residues in the C-terminal cytoplasmic domain, every Cys
substitution caused a reproducible change in lysis time.
Although most of the changes caused retarded lysis, several
significantly accelerated the onset of lysis, beyond that
caused by the original C51S change, including substitutions
in both TMD1 and TMD3 (figure 10-1B, 10-2B). Thus every
potential TMD surface of S appears to be involved in setting
the lysis clock. At the molecular level, S has been shown to
form oligomers by crosslinking studies, but the ultimate
degree of oligomerization is unknown; some lysis-defective
alleles appear to be blocked at the monomer, dimer, and
oligomer stages (53).
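
The 16 Da and 14 Da figures quoted above for the early-lysis substitutions at positions 51 and 52 follow directly from standard average residue masses. The short Python sketch below reproduces them; the mass table holds commonly tabulated average values for residues within a polypeptide, and the helper name `mass_change` is hypothetical.

```python
# Average residue masses in daltons (amino acid residues within a
# polypeptide chain, i.e. after loss of water on peptide bond formation)
RESIDUE_MASS_DA = {
    "G": 57.05,   # glycine
    "A": 71.08,   # alanine
    "S": 87.08,   # serine
    "C": 103.14,  # cysteine
}


def mass_change(substitution):
    """Return the mass change in Da for a substitution such as 'C51S'.

    The first letter is the original residue and the last letter is the
    replacement; the digits in between (the position) are ignored.
    """
    original, replacement = substitution[0], substitution[-1]
    return RESIDUE_MASS_DA[replacement] - RESIDUE_MASS_DA[original]


if __name__ == "__main__":
    for allele in ("C51S", "A52G"):
        print(f"{allele}: {mass_change(allele):+.2f} Da")
```

Both early-lysis substitutions reduce side-chain bulk, by roughly 16 Da (Cys to Ser) and 14 Da (Ala to Gly).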

Although the sequences of the short periplasmic
N-terminal domain and the cytoplasmic C-terminal domain
are non-essential for lysis, the overall charge on each
domain can affect timing dramatically. Especially with
the C-terminus, increasing anionic or cationic character
accelerates and retards the onset of lysis, respectively, and
scrambling the sequence by a frameshift mutation is tolerated
(18, 101). Strangely, however, although a G₂H₆G₂ tag
inserted near the membrane interface in the C-terminal
cytoplasmic domain resulted in a functional, lytic S protein
which has been the source of purified holin, the simple
addition of a hexahistidine tag or bulky domains such
as β-galactosidase or GFP in the C-terminal cytoplasmic
domain disrupts holin function (101) (R. White, A.
Gruendling, and R. Young, unpublished data).

Purified S, or more properly, the purified hexahistidine-tagged
variant of the S105 holin, can, at low efficiency,
permeabilize liposomes loaded with a fluorescent dye, if
diluted out of a chaotropic solution (99). This permeabilization
fails with the S A52V variant and with a ts version, S A55T,
at the nonpermissive, but not the permissive, temperature
(J. F. Deaton and R. Young, unpublished data), suggesting
this assay faithfully mimics the in vivo lytic event. More
efficient delivery methods have been developed which should
allow ultrastructural analysis of the holin-permeabilized
liposomes by electron microscopy (38).

The Antiholin S107

The RNA secondary structures and other sequence elements 
at the beginning of the S gene were engineered to produce 
different amounts, both relative and absolute, of the holin 
S105 and the antiholin S107. It was found that absence 
of the antiholin has only a small effect in accelerating 
the timing, but excess antiholin effectively abolishes lysis 
(88). Quantitative immunoblotting of membrane samples 
revealed that, in general, the onset of lysis varies inversely 
with the excess of S105 over S107 (33) (figure 10-2C). This 
suggests that S107, despite being identical with S105 for 
all but its two N-terminal residues, is effectively a lysis 
inhibitor which acts by titrating out S105. This notion is 
supported directly by crosslinking studies (55). Depolarization
of the membrane with an energy poison subverts
the S107-mediated inhibition (17). The basic residue at 
position 2 is what distinguishes the inhibitor S107 from 
the holin S105; adding more basic residues at this position 
makes it a better inhibitor (51, 77, 101). These data suggest 
that the Lys2 positive charge retards the N-terminus of 
S107 from flipping up through the bilayer (figure 10-3), 
as has been demonstrated for the N-terminus of leader 
peptidase (29). This model is supported by the finding 
that fusing a signal sequence to S107 eliminates its inhibitor
character, presumably because it causes sec-mediated
export of TMD1, and converts it into a functional holin if 
leader peptidase activity is present to cleave off the signal 
(figure 10-3) (50, 51). 

Despite the elegant design of the dual-start antiholin 
system, it is critical to emphasize that the holin S105 has 
an intrinsic timing function independent of the antiholin 
S107; the S105 allele, in which the Met1 start codon of
S107 is inactivated, is only marginally advanced in lysis 





Figure 10-3 Membrane topologies of holins. A: S105. B: Putative S107 topology. C: sec-dependent chimeric VIII-S gene
product, showing leader peptidase cleavage site (51). D: S²¹. E: T4 T. Adapted from Wang et al. (111), with permission.

timing (52, 88). Models to explain the evolutionary advantages
of having an antiholin are discussed below (Antiholin
Diversity and Raison d’Etre).

Rz and Rz1

The Rz (153 codons) and Rz1 (60 codons) genes are unique
in biology: one segment of DNA occupied by two genes
in alternate reading frames but responsible for the same
phenotype (56, 123). In the presence of millimolar concentrations
of divalent cations, null alleles of Rz or Rz1, or
their related P22 equivalents, cause lysis to terminate in
metastable spherical cell forms (30, 122, 123). Rz1 has
a signal peptidase II cleavage site and is predicted to be a
40-residue, proline-rich lipoprotein attached by its lipoyl
moieties to the inner leaflet of the outer membrane, consistent
with a species detected by palmitate labeling (68,
104). The Rz protein has not been identified, but it has a
consensus signal peptidase I site and presumably is a periplasmic
protein. Rz1am in the wrong suppressor background has
a dominant negative phenotype that is partially
cation-independent (123).

RzRz1 equivalents can be identified in the sequences of
most other phages of Gram-negative bacteria (but not for 
phage T4), sometimes by sequence similarity, but also, as for 
the P2 lysBC genes, from their position immediately distal to 
an endolysin and by the unusual nested relationship with 
the smaller reading frame being proline-rich and containing 
a signal peptidase II site (figure 10-1A) (123). h/s/lam has 
a cation-independent lysis delay phenotype. lysBC, but not
lysB or lysC alone, can complement Rzam Rz1am, which is
consistent with the idea that Rz and Rz1 form a functional
complex (74). However, nothing is known about the function
of Rz-Rz1, although it has been suggested that Rz may
be responsible for the endopeptidase activity originally
ascribed to R (15). There does not seem to be a link between
the type of endolysin activity and the presence of RzRz1
genes, which are found in phages with three different types
of endolysins: true lysozyme (P22), amidase (T7), and
transglycosylase (λ, P2) (figure 10-1A). The cation dependence of
the phenotype in λ suggests that Rz-Rz1 is involved with
compromising the outer membrane, or its covalent links
with the murein layer. However, it is just as likely that it
contributes toward more efficient endolysin activity and
thus is only needed (in λ/P22 infections) when a
cation-stabilized outer membrane can contribute to the residual
structural integrity left after endolysin action.

Diversity in Holin-Endolysin Systems 

Holin Diversity 

At the time of the most recent general review on holins, 
there were already 105 holin genes which could be sorted 
into 32 families with no detectable sequence similarity 
(111). Now these numbers have increased to more than 250 
holins in more than 50 gene families, mostly due to the 
acceleration of bacterial genomics and thus the parallel 
increase in prophage genomics. Table 10-1 has a summary, 
but a current and more comprehensive tabulation of holin 
genes and other associated lysis cistrons is available at 
www.thebacteriophages.org. Most can be grouped into two 
classes: class I, like λ S, with three TMDs and the N-terminus
out; and class II, like S²¹, with two TMDs and both N and
C termini in the cytoplasm. The T4 holin T forms a class 
of its own (class III), with its relatively large C-terminal 
periplasmic domain, and some of the others have uncertain 
topologies based on primary sequence analysis (figure 10-3). 
Very few holins and putative holins have been subjected to 
functional analysis, which is difficult outside the phage 
genomic context because of the constraints on cloning 
lethal genes in multicopy plasmids. Moreover, it is clear that 
assessing the biological function of a protein that has as its 
primary function a single, saltatory, and tightly scheduled 
lethal event depends on reproducing the normal level of 


Table 10-1 Summary of Lysis Cassette Diversity

                  Gram-negative hosts                  Gram-positive hosts
A. Holin class    B. Total No.   C. No. of families    D. Total No.   E. No. of families
                  of families    with secretory        of families    with secretory
                                 endolysins                           endolysins

I                 10             3                     9              2
II                9              6                     11             6
III               1              0                     0              0
?                 4              1                     7              1
Total             24             10                    27             9


The data are summarized from the comprehensive table deposited at the website associated with this book, which was adapted from a similar
website-deposited table, with permission (112). For this table, holin gene families (columns B and D) are defined as orthologous groups with sufficient sequence
similarity to permit manual alignment. If at least one holin in the family is associated with an endolysin which has been shown to support sec-mediated
externalization or has an apparent TMD or secretory signal sequence at its N-terminus, then the entire family is counted in columns C or E. For up-to-date
references, please see the website www.thebacteriophages.org.
expression. This is not straightforward in many bacterial
backgrounds not as well characterized as E. coli. Moreover,
a secretory or membrane protein overproduced from a
multicopy plasmid can insult the membrane sufficiently
to cause release of the R endolysin (4). Also, there
does not appear to be a correlation between the enzymatic
activity of the endolysin and the class of holin; for example,
in lambdoid phages, three different families of holins, including
one class I and two class II, have been found with two
enzymatically different kinds of endolysins (figure 10-1A).

In a sense, this unprecedented diversity is an obstacle for
computer analysis, because there are few examples of holin
families with enough members to permit comparative analysis
of critical residues (see table 10-1 and website table). An
exception is the class II holin family represented by the
holin of the lambdoid phage 21. Alignments of the members
of this family suggest that total side chain bulk is conserved
at different positions in the paired TMDs (figure 10-4), as
though helical packing is a critical factor, a notion also
suggested by mutational analysis of the λ holin (see below).

The profusion of different holin families makes it unlikely
that holins form a large, complex, well-defined structure
providing a passageway through the bilayer. Instead, it
suggests that it is easy to evolve a holin. Indeed, no secondary
structure element is easier to evolve than a TMD; all that
is required is roughly 16 residues of predominantly hydrophobic
character with no net charge. This hydrophobicity
will direct the protein to the membrane, and the bilayer,
lacking alternative H-bonding partners for the backbone
amides and carbonyls, will impose α-helical structure on
the domain. What else besides two or three TMDs is required

         TMD1 region                        TMD2 region

21       MKSMDKI?TGIAYGTSAGSAGYWFLQWLDQVS   PSQWAAIGVLGSLVLGFLTYLTNLYFKIREDRRKAARGE
PA2      MKFMDKLTTGVAYGTSAGSAGYWFLQLLDKVT   PSQWAAIGVLGSLVFGLLTYLTNLYFKIKEDKRKAARGE
DLP12    MKSMDKLTTGVAYGTSAGNAGFWALQLLDKVT   PSQWAAIGVLGSLVFGLLTYLTNLYFKIKEDRRKAARGE
Qin      MKFMDKLTTGVAYGTSAGNAGFWALQLLDKVT   PSQWAAIGVLGSLVFGLLTYLTNLYFKIFEDRRKAARGE
933W     MYQMEKITTGVSYTTSAV?TGYWFLQLLDRVS   PSQWAAIGVLGSLLFGLLTYLTNLYFKIKEDRRKAARGE
φ80      MYRMDKLTTG...                      PEQWNAIGVLV?LIIAVMTYLTNLYFKIREDNRRSRSRDEPNVE

(? marks a residue, and ... the remainder of a row, that is illegible in this copy.)

Figure 10-4 Alignment of representatives of the S²¹ class II
holin family. Sequences are compared with the PA2 sequence.
Changes within either TMD resulting in increased or decreased
side-chain bulk are indicated by shading or boxes, respectively.
DLP12 and Qin are cryptic lambdoid prophages of E. coli K-12
(90). PA2 is the lambdoid prophage of a porcine strain of
E. coli (16). 933W is the lambdoid prophage carrying the
Shiga-like toxin gene in the hemorrhagic E. coli strain
O157:H7 (86).


to make a holin is not clear, although the available evidence 
suggests the ability to oligomerize is important (see How 
Do Holins Work?). 
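
The minimal TMD requirement stated above (roughly 16 residues, predominantly hydrophobic, no net charge) can be illustrated with a simple sliding-window scan. The sketch below is only illustrative: it uses the Kyte-Doolittle hydropathy scale, a mean-hydropathy cutoff of 1.6, and a charge count that ignores histidine, all of which are assumptions not taken from this chapter. The test sequence is the PA2 holin as printed in figure 10-4, assuming the two printed halves are contiguous.

```python
# Kyte-Doolittle hydropathy values (a standard published scale).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

WINDOW = 16        # "roughly 16 residues", from the text
MIN_MEAN_KD = 1.6  # assumed cutoff, not from the text


def net_charge(segment):
    """Count K/R as +1 and D/E as -1; histidine is ignored (assumption)."""
    positive = sum(segment.count(r) for r in "KR")
    negative = sum(segment.count(r) for r in "DE")
    return positive - negative


def is_tmd_like(segment):
    """True if the segment is hydrophobic on average and has no net charge."""
    mean_kd = sum(KD[r] for r in segment) / len(segment)
    return mean_kd >= MIN_MEAN_KD and net_charge(segment) == 0


def tmd_like_windows(protein, window=WINDOW):
    """Return the 0-based start of every window that passes is_tmd_like."""
    return [i for i in range(len(protein) - window + 1)
            if is_tmd_like(protein[i:i + window])]


# PA2 holin as printed in figure 10-4 (two halves joined; assumed contiguous).
PA2 = ("MKFMDKLTTGVAYGTSAGSAGYWFLQLLDKVT"
       "PSQWAAIGVLGSLVFGLLTYLTNLYFKIKEDKRKAARGE")

print(tmd_like_windows(PA2))
```

A scan of this kind flags candidate membrane spans only; as the text notes, what else a holin needs beyond its TMDs is not captured by such a filter.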

Antiholin Diversity and Raison d’Etre 

The Dual-Start Motif 

As stated above, many holin genes have dual-start motifs, 
with two potential start codons separated by at least one 
codon for a basic amino acid. In the few other cases tested 
(P22 13, 21 S, and φ29 14), eliminating codon 1 accelerates
the onset of lysis, which is presumptive evidence that the
dual start functions analogously to that of S (6, 77, 102). At
least for S²¹, the holin gene of lambdoid phage 21, the
longer product has been shown to have antiholin character,
based on the N-proximal basic residue (6). In many of these
cases, however, including S²¹, the holins are class II
sequences (only two TMDs). Thus the molecular basis of the 
inhibitory function must be fundamentally different because 
the N-terminal domain of a class II holin is not available for 
transiting the bilayer (figure 10-3). 
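
As a rough illustration of the dual-start definition given above (two potential start codons separated by at least one codon for a basic amino acid), the following sketch scans the N-terminus of a predicted holin sequence. It is a hedged example, not a validated gene-finding rule: the function name and the `max_second_start` limit are hypothetical, and only Lys and Arg are treated as basic. The λ S example uses Met1-Lys2-Met3 as described in the text, and the PA2 N-terminus is taken from figure 10-4.

```python
BASIC = set("KR")  # Lys and Arg; histidine is not counted (assumption)


def find_dual_start(n_terminus, max_second_start=6):
    """Return (1, n) if a second Met at codon n follows a basic residue.

    max_second_start is a hypothetical limit on how far downstream
    the second start codon may lie. Returns None if no motif is found.
    """
    if not n_terminus.startswith("M"):
        return None
    limit = min(len(n_terminus), max_second_start)
    for idx in range(2, limit):  # idx is 0-based; codon number is idx + 1
        if n_terminus[idx] == "M":
            between = n_terminus[1:idx]
            if any(res in BASIC for res in between):
                return (1, idx + 1)
    return None


print(find_dual_start("MKM"))      # lambda S: Met1-Lys2-Met3
print(find_dual_start("MKFMDKL"))  # PA2 holin N-terminus (figure 10-4)
print(find_dual_start("MAAMDK"))   # no basic residue between the two Mets
```

A positive hit is only presumptive; as the text emphasizes, antiholin character has been demonstrated experimentally in very few cases.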

Irrespective of the basis of the inhibitory character of the
longer gene product of dual-start holin genes, at least in the
two cases examined to date, it is clear that RNA secondary
structure plays a key role in regulating the absolute and relative
rates of initiation at the two start codons. In λ S, two
stem-loop structures, one overlapping the strong consensus
Shine-Dalgarno sequence for S and a second spanning
codons 11-16, are involved in achieving the approximately
1:2 partition of initiations at codons 1 and 3 (figure 10-1A).
Weakening the upstream structure or strengthening the
downstream structure leads to dominant lysis-defective S
alleles producing an excess of S107; conversely, strengthening
the upstream structure or weakening the downstream
structure increases the S105:S107 partition and accelerates
the onset of lysis (21, 33, 77). A 60 kDa host protein binds
the RNA immediately adjacent to the upstream stem-loop.
Absence of this upstream sequence greatly reduces the
translational efficiency of the S cistron, indicating that the
host factor and the RNA may play a role in regulating S and
also perhaps in the S105:S107 partition (32, 76).

In S²¹, there is only an upstream stem-loop, and although
it only partly occludes the Shine-Dalgarno sequence, weakening
it again produces a partially dominant lysis-defective
allele that produces only S²¹71, the antiholin (6). However,
elimination of codon 1 greatly reduces initiations at
codon 4, the start codon for S²¹68, the holin, and thus
masks the early lysis phenotype expected from loss of
the antiholin. Mutagenesis of the translational initiation
region suggests that the partition of antiholin versus
holin starts in S²¹ results from initiation complexes forming
over codon 1 and then frequently diffusing downstream to
codon 4 (6).
Why Dual-Start Antiholins?

Why do so many holin genes have a dual-start motif, and 
thus, presumably, encode an antiholin? In λ, the short lysis
delay associated with the normal level of S107 antiholin
production could easily be achieved more simply by changing
a residue in the holin S105. First, of course, it seems
likely that there may be physiological conditions where 
the partition of antiholin-holin starts shifts in favor of 
the former, and thus results in a very significant delay, if 
not complete blockage, in lysis. A quantitative model for lysis 
timing suggests that a reduction in the productive capacity 
of the host or in the availability of fresh hosts in the medium 
should lead to a selection for longer lysis times (110). The 
dual-start system could provide a phenotypic adaptation to 
such changed conditions, if it were programmed to detect 
the appropriate physiological signal and effect an inversion 
of the normal antiholin-holin ratio. 

Another perspective is that the S107-S105 system contributes
to the saltatory nature of holin-mediated lysis.
The wild-type S gene produces about twice as much S105 as
S107. Because S105 preferentially dimerizes with S107, half
the S105 is sequestered in heterodimers, resulting in two
thirds of the total S protein in nonfunctional, or less functional,
heterodimers (55). Whatever constitutes the spontaneous
triggering event would collapse the energized state
of the membrane, thereby converting the nonfunctional
heterodimers into functional holin dimers (figure 10-5).
Thus at the instant of triggering, the functional holin
level is suddenly tripled, presumably expediting the disruption
of the membrane barrier. This may be the best way
to achieve a reasonable time delay, by reducing the rate
of accumulation of functional holin, while also ensuring
that sufficient holin can be recruited at the instant
of triggering to effect rapid release or activation of the
endolysin.
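
The bookkeeping behind the "two thirds" and "tripled" figures can be checked with a few lines of arithmetic. The sketch below is only a worked example of the model as stated in the text: it assumes that every S107 pairs with exactly one S105, that heterodimers contribute no functional holin before triggering, and that all S protein is functional afterwards. The function and key names are hypothetical.

```python
from fractions import Fraction


def holin_pools(s105, s107):
    """Partition S protein, assuming each S107 sequesters one S105."""
    heterodimer_pairs = min(s105, s107)
    free_s105 = s105 - heterodimer_pairs
    total = s105 + s107
    return {
        # share of S105 tied up in heterodimers
        "s105_sequestered": Fraction(heterodimer_pairs, s105),
        # share of all S protein (S105 + S107) in heterodimers
        "total_in_heterodimers": Fraction(2 * heterodimer_pairs, total),
        # functional holin before and after the membrane is depolarized
        "functional_before": free_s105,
        "functional_after": total,
        "fold_increase": Fraction(total, free_s105) if free_s105 else None,
    }


# Wild-type partition: about two S105 for every S107.
pools = holin_pools(s105=2, s107=1)
for name, value in pools.items():
    print(name, value)
```

With the 2:1 partition, half of the S105 and two thirds of the total S protein sit in heterodimers, and the functional pool rises from one unit to three at triggering, which is the tripling described above.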


rI and Lysis Inhibition in T4

Not all antiholins are alternate-start products of the same
gene encoding the holin. The best studied independent
antiholin is the RI protein of phage T4. Its gene, rI, was
one of the four T4 plaque-morphology genes (the others
being rIIA, rIIB, and rIII) identified by Hershey in experiments
that are regarded as among the founding works
of modern molecular genetics (58). The phenomenon underlying
these phenotypes is called LIN, for “lysis inhibition.”
In a T4 infection, the imm gene is expressed early: Imm is
a cytoplasmic membrane protein which blocks entry of the
genome of a T4 phage attempting to superinfect the infected
host. Somehow, this aborted secondary infection leads to
imposition of LIN, in which the normal scheduled lysis
at 25 minutes after infection is blocked. LIN can persist
indefinitely as long as secondary infection events continue
to occur. The LIN state in every respect mimics the physiology
of a holin-defective infection, in that intracellular
virions and endolysin accumulate far beyond normal levels,
except that the LIN collapses if the energized state of the
membrane is subverted by an energy poison (2).

An Ancient Player Unmasked: RI Is an Antiholin

Despite its ancient and well-characterized physiology, 
and its historical importance, the mechanism of LIN 
was elusive for more than five decades, mainly because of 
the difficulties of manipulating the T4 infective cycle and 
the bewildering number of r loci identified over the years 
since Hershey’s initial work (2). Recently, analysis of this 
extensive literature led Paddison et al. (83) to conclude that 
among the original r loci only rI is required for LIN in all
bacterial hosts. Moreover, another LIN-defect locus, rV, was 
shown to be allelic to t, suggesting that the r genes were 



Figure 10-5 Holin function and the dual-start motif. Holins and antiholins from dual-start holin genes are depicted as light
and dark shaded rectangles, respectively. The cartoon reflects the normal 2:1 proportion of the holin S105 and S107 and
assumes that dimerization is the first step in hole formation. According to this model, each S107, which preferentially
heterodimerizes with S105 (55), removes one S105 from the pool of holin monomers involved in the timing of hole
formation. At the instant of the first hole formation, the collapse of the membrane potential allows the S107 protein to
assume the conformation of the S105 holin. Thus, the scheduled triggering results in tripling the amount of functional holin
homodimers available for hole formation. Adapted from Grundling et al. (55), with permission.
a signaling system targeting the holin (42). Finally, it has
been shown that LIN can be imposed upon the chimera
λt, in which S is replaced by t, by superinfecting with T4
(89). LIN is not imposed upon the isogenic λS+ when it is
superinfected with T4. The imposition of the LIN state was
found to depend on the allelic state of the rI and t genes of
the superinfecting T4 phage, indicating that the LIN-sensitive
state of the T holin is transient and suggesting that the
antiholin acts by impeding entry of the holin into oligomeric
“pre-hole” assemblies (89). Remarkably, using the same λt
system to test for LIN, the requirement for superinfection by
T4 can be replaced by simple induction of a plasmid-borne
copy of rI (98a). Moreover, crosslinking in vivo with
formaldehyde revealed that the antiholin RI binds to the
membrane-embedded holin. Thus, at last, the outlines
of a molecular explanation for the principal sacrament of
the Phage Church are apparent, but many questions
remain. It is unclear why simple expression of rI from a plasmid
eliminates the need for secondary T4 infection. The key
to this may lie in the instability of RI, which has only been
detected after cross-linking to T. Perhaps the ectopic periplasmic
localization of the 169 kb genome of the superinfecting
T4 particle, or proteins ejected from its capsid, somehow
stabilize RI, which appears to consist of a 6 kDa periplasmic
domain tethered to the membrane by an N-terminal TMD
(83). It is unknown how RI recognizes T, and how its binding
prevents T from entering the pathway to lysis triggering. The
roles of rIIA, rIIB, and rIII in LIN are also mysterious.
Although none is essential in E. coli K-12 hosts, each has a
dramatic LIN-defective phenotype in some backgrounds.
Analysis of lysis phenotypes suggests that RIII, which is
predicted to be a cytoplasmic protein, may stabilize the RI-T
interaction (89). The ability to reconstruct LIN apart from
the complexity of the T4 vegetative cycle promises that rapid
progress on this venerable phenomenon may be at hand.

Other Independent Antiholin Genes 

Genetic or physiological evidence for antiholin genes is
available for the classical coliphages P1 and P2. In P1,
although the known lysis genes are all under control of the
late gene activator, the holin and antiholin genes, lydAB, are
adjacent to each other but unlinked to the endolysin gene, lyz
(formerly 27), encoding a T4 E orthologue (figure 10-1A). Remarkably,
the antiholin LydB is essential to obtain a productive burst;
in its absence, lysis occurs catastrophically early (108).
LydB is tethered to the membrane by a single N-terminal
TMD, with a highly hydrophilic domain of about 14 kDa
in the cytoplasm. Experiments with a λΔS::lydA chimera
have shown that LydB retards lysis mediated by LydA, a
class I holin, but not S-mediated lysis, in the context of
the λ vegetative cycle (M. Xu, D. K. Struck and R. Young,
unpublished data). In P2, the lysis cassette comprises the
genes Y K lysA lysB lysC (124). Y, K and LysB/LysC are, respectively,
a class I holin, an orthologue of the λ R endolysin,
and, as noted above, functional analogues of λ Rz/Rz1.
A null mutation in lysA causes a 5 minute acceleration of
the onset of lysis, and overproduction of LysA in trans to an
induced λ chimera, with Y replacing S, but not with induced
λ, causes a significant lysis delay (T. Park and R. Young,
unpublished data), suggesting that LysA is a Y-specific antiholin.
LysA appears to be an integral membrane protein with
four TMDs, and is predicted to have two basic residues at
its cytoplasmic N-terminus, reminiscent of the antiholins
of the class II holin genes (124). However, the molecular
basis of the LysA-mediated lysis delay and the LysA-Y specificity
is not known.

Endolysin Diversity 

Four different enzyme activities have been associated with
endolysins: “true lysozyme” (also known as muramidase;
e.g., T4 e lysozyme, P22 gp19 lysozyme), which hydrolyzes
the 1,4 glycosidic bonds in the murein; transglycosylase
(e.g., λ R or P2 K), which attacks the same bonds but
forms a cyclic 1,6-anhydro-N-acetylmuramic acid product;
amidase (e.g., T7 gp3.5), which hydrolyzes the amide bond
between the N-acetylmuramic acid and L-alanine residues
in the oligopeptide crosslinking chains; and endopeptidase
(e.g., φ11 lysin), which attacks the peptide bonds in the
same chains (79, 119, 121). In some cases, more than one
of these activities is found in the same protein (79). Again,
whenever the notion has been tested, with important
exceptions noted below (The Other Paradigm: sec-Mediated
Lysis), endolysins of different enzymatic types complement
each other. Nonspecificity with respect to holins is clearly
seen by comparing λ and P22, which have nearly identical
class I holins but completely different endolysins, and
also P22 and PS34, which have nearly identical lysozymes
orthologous to T4 E but unrelated class I and class II
holins, respectively (figure 10-1A). Almost all endolysins
are late proteins; a notable exception is T7 gp3.5, which
is an early protein and has an important early function,
as a specific inhibitor of T7 RNA polymerase (34, 75). The
holin-endolysin-RzRz1 lysis gene cassette conserved at
the start of the late gene transcript of all lambdoid phages
is present in the late T7 transcript, except that the endolysin
gene is replaced by an unrelated gene involved in phage
DNA maturation (figure 10-1A).

Many endolysins of phages specific for Gram-positive
hosts have a modular structure, with a C-terminal domain
determining binding specificity exquisitely sensitive to the
differences in the structure of the cell wall (37, 40, 41, 98).
For example, the Cpl-1 endolysin, a T4 lysozyme orthologue,
from the pneumococcal phage Cp-1 has a C-terminal domain
specific for the choline component of the teichoic acid
of its host, whereas the lysozyme from phage Cp-7, with a
similar enzymatic domain, is choline-independent. Similarly,
the Dp-1 endolysin, an amidase, has a C-terminal cell
wall binding domain orthologous to the choline-dependent
domain of Cpl-1. The specificity and lytic activity have been
exploited in the use of a phage endolysin as an aerosol
to ablate pneumococcal infections in the mouse nasopharyngeal
cavity, a development that highlights the possible
use of phage proteins as antibacterial factors (80).

Emerging Perspectives and Questions 
about Holin-Endolysin Lysis 

The Other Paradigm: sec-Mediated Lysis 

A Dogma and Its Death 

Although the first comprehensive review of phage lysis
did not appear until 1992, the clarity and elegance of the
λ paradigm rapidly imposed itself as a kind of dogma. The
complementary phenotypes of two S missense alleles, S A52G
and S A52V, with unconditional plaque-formation defects
illustrate this point. The former causes lysis too early, before
the first virion is assembled, whereas the latter is
lysis-defective, indistinguishable from the null Sam7 in terms of
the unabated accumulation of intracellular endolysin and
virions (53, 88). This is formal genetic proof that, for λ, the
holin has two essential functions: to disrupt the membrane
to allow escape of the endolysin to its murein substrate, and
to effect the scheduled termination of the infective cycle. In
some phage systems, however, holin genes are not essential
for plaque formation; for example, T7 17.5am mutants
make plaques. Examination of this problematical phenotype
emphasizes the point made above, that holin nulls increase
virion production. Presumably, when T7 17.5am virions are
plated on a lawn of sensitive cells, the lack of timely lysis
in each infection is compensated by a greatly increased
accumulation of virions per infection, such that when
the slower and nonspecific insults to the general health
and envelope integrity of the infected host associated with
the aggressive T7 vegetative cycle finally lead to membrane
leakiness and lysis, the net result is formation of a plaque
of substantial size. This perspective explains why holin
genes had been identified as essential genes only in lambdoid
and T-even phage, where fortunately the nonspecific
“rotting” of the infected cell at times long past the normal
onset of lysis does not result in a plaque.

Just as the λ lysis paradigm was becoming clarified, at
least in terms of its molecular components, a paradigm
shift began with studies by Santos and colleagues on the
phage fOg44, which grows on the Gram-positive bacterium
Oenococcus oeni (85, 95). In this phage, the predicted product
of the lys44 gene is a muramidase with an N-terminal
signal sequence; moreover, Lys44 is processed by leader
peptidase during infection (95), presumptive evidence
that the endolysin is exported by the sec machinery.
Nonetheless, fOg44 apparently has a holin gene, hol44 (85),
and its co-expression with lys44 in the heterologous E. coli
environment leads to more efficient lysis (95). A survey of
orthologous endolysins from other phages of Gram-positive
bacteria suggested that some, but not all, of these endolysins
had N-terminal sequences resembling secretory signals,
although in every case an adjacent holin gene orthologous
to the putative hol44 was also present. Unfortunately,
definitive physiological experiments with these systems
to ascertain holin-negative phenotypes have not yet been
practical. Nevertheless, it is clear that in these phages
the holin is not required for export of the endolysin. The
authors proposed that the exported, processed endolysin is
either inactive, or less active, until the holin exerts its timed
disruption of the membrane. According to this scheme,
holin triggering would, at minimum, collapse the energized
state of the membrane and activate the pre-localized endolysin.
That this may be the mechanism of activation of the
exported endolysin is suggested by the fact that, in
Gram-positive bacteria, host muralytic enzymes are present in
an inactive state in the cell wall and can be activated by
energy poisons or anoxic conditions. Recent studies with
pH-sensitive fluorescent dyes and murein modification
reagents have indicated that the murein layer in
Gram-positive bacteria is maintained at a much lower pH than
the ambient medium, at least in respiring cells (27, 28, 66).
Some endolysins have high pH optima, so the energization
of the membrane may be directly responsible for maintaining
the secreted endolysins in an inactive or less active state.

Echoes in Coliphage: Secreted Endolysins

Sequence comparisons also reveal that a number of endolysins
from phages of Gram-negative bacteria have
N-terminal sequences that could engage the sec system.
There is a hydrophobic N-terminal extension on the
predicted endolysins, all orthologs of T4 E, from coliphage
P1 and from the lambdoid phages 21, P22, PS119 and
PS 3 (figure 10-6). The length of the hydrophobic sequence
and the distribution of flanking charged residues are,
however, indicative of an N-terminal TMD, a “signal anchor”
or “uncleaved signal sequence,” rather than of a cleavable
signal peptide like that of the endolysin in fOg44. Experiments
with clones of P1 lyz confirm that lysis, albeit somewhat
delayed and less saltatory, follows after induction of
this endolysin gene in the absence of a holin gene (118b).
Moreover, cyanide treatment triggers early lysis and
dramatically accelerates lysis already begun, whereas 1 mM
azide, a specific inhibitor of SecA, blocks lysis. A detailed
analysis of P1 Lyz confirmed that it is initially localized
as an N-terminally tethered, enzymatically inactive form,
and then is activated by release from the membrane (118b).
This release occurs spontaneously at a low rate, thus
accounting for the holin-independent lysis supported
by the lyz clone, or quantitatively, when the membrane is
depolarized artificially or by the triggering of the P1
holin. The N-terminal TMD of Lyz was designated as a SAR
Figure 10-6 Signal-anchor-release (SAR) domains in endolysins of Gram-negative bacteria. The N-terminal sequence of the
endolysins of the lambdoid phages 21 (22), and PS119 (GenBank accession CAA09710), P22 (30), and the non-lambdoid
phages P1 (30), T4 (81), and HP1 (7) are shown. The N-terminal SAR domain is underlined in the P1, HP1, and R²¹ endolysins
(118b). The catalytic triad corresponding to residues E11, D20, and T26 of T4 lysozyme is boxed (87). The dashed bracket
shows the disulfide linkage that inactivates membrane-tethered P1 Lyz by occupying the catalytic Cys residue, and the
solid bracket shows the disulfide linkage in the active, periplasmic form of the enzyme (118a).


(signal-anchor-release) domain because of its unprecedented
ability to escape from the bilayer. Further experiments
revealed that the membrane-tethered form of Lyz
was inactive because of a disulfide bond occupying an
active-site Cys residue, and that activation of the muralytic
activity occurs when the free Cys in the SAR domain, once
liberated from the membrane, triggers a disulfide bond
isomerization (118a). Structural analysis revealed that the
isomerization is accompanied by a drastic conformational
change that completely reorganizes the catalytic cleft. Thus
P1 Lyz is regulated by topological, covalent, and conformational
constraints, in addition to being subject to the timing
schedule of the P1 holin-antiholin system. Other endolysins,
like R²¹ (figure 10-6), have SAR domains but do
not have either a Cys residue in the membrane-embedded
sequences or a Cys at the active site and thus must have
an alternative way of regulating the exported muralytic
activity.

How Do Holins Work? 

The foregoing has promoted the idea that holins are 
remarkable proteins, honed by powerful evolutionary 
forces to be the simplest and most adjustable of, if not 
clocks, biological timers. Moreover, it also seems clear that 
holins are sensitive to the energy state of the membrane 
and thus, perhaps, constitute the simplest of gated 
membrane assemblies. The reader has no doubt noticed, 
however, that there is very little mechanistic insight into 
how holins work, however elegant and amusing the 
phenomenology. The two interesting, and unanswered, 
questions are: how do holins disrupt the membrane, and 
how do they accomplish this disruption according to a strict 
temporal program? 

How Big Is the “Hole”? 

Nothing is known about the nature of the membrane
lesions. One experiment has been done to calibrate the
scale of the “S-hole.” Fusions of R were created with
full-length lacZ, creating a gene encoding a hybrid R-βgal
protein of 1189 residues (109). The fusion proved to be fully
active and tetrameric, with a mass of nearly 0.5 MDa;
remarkably, it is also fully active in lysis, with only a short
delay after the holin triggering, presumably due to slower
diffusion of the huge chimera in the periplasm. This result
suggests that at least some of the lesions created by the triggering
of S are of a size comparable to the large-scale lesions,
in excess of 30 nm diameter, created by the thiol cytolysins
such as streptolysin O (84, 97). Preliminary electron micrography,
using purified S to permeabilize liposomes, suggests
that artificial membranes, at least, are completely disrupted
by the holin, rather than permeabilized by the assembly
of a regular pore structure (J. F. Deaton, C. Savva, J. Sun,
A. Holzenburg and R. Young, unpublished data).

Are All Holes, and Holins, Equal? 

If some holins are required only to effect a scheduled 
collapse of the membrane potential, thus activating pre¬ 
secreted endolysins, can they function to allow the fully 
folded, secretion-less endolysins such as λ R or T4 E to 
attack the cell wall? Definitive experiments appropriately 
controlled for physiological levels of expression are not 
yet available. However, recent experiments have shown 
that premature triggering of the S holin with an energy 
poison allows lysis with R but not with a full-sized R-β-galactosidase, 
suggesting that the potential “hole size” 
increases with the accumulation of the holin in the 
membrane (109). An attractive, unifying notion is that 
all holins produce a range of lesion sizes, perhaps 
dependent on the population of different n-mer states in 
the membrane at the time of triggering, which, in turn, 
may be kinetically defined, rather than in an equilibrium 
state. The size of the lesion may simply reflect the size 
of pre-lesion aggregates in the membrane, which, in 
turn, could be related to the mechanism of spontaneous 
triggering and its temporal dependence. That is, significant 
membrane disruption, and thus the possibility of the release 
of pre-folded endolysins, may be an inevitable consequence 
of the scheduled triggering mechanism, rather than its 
main goal. 

How Do Holins Keep Time? 

Ultimately, the most intriguing issue remains how holins 
function as such precise timers. Our understanding of 
this phenomenon is still at the stage of learning which 
questions can be answered. The S gene remains the para¬ 
digm. Two recent experiments help constrain the range 
of models. A common characteristic of all holins tested 
to date is the triggerability by energy poisons. One of 
the first models proposed for the timing capacity of the 
λ holin had it that the spontaneous lysis mediated by S 
comes about because, as S accumulates in the membrane, it 
causes ion leakage and gradually titrates out the membrane 
potential. When the leakage became unsupportable and 
the energization of the membrane collapsed, hole forma¬ 
tion would immediately ensue, as though an energy poison 
had been added externally. This model was falsified by 
experiments using the system developed by H. Berg and 
colleagues to study the powering of the bacterial flagellum 
by the proton-motive force (pmf) (46). In these experiments, 
bacteria were tethered to a glass slide by antiflagellin anti¬ 
bodies and then their rotation speed monitored by video¬ 
microscopy; under appropriate conditions, the rotation 
speed of the tethered cell is proportional to the pmf over a 
wide range of voltages. By inducing a cloned λ lysis gene 
cassette in the tethered cell, it was shown that the rate of 
cellular rotation is maintained up until the instant of trigger¬ 
ing; at this point, rotation is halted and, within seconds, the 
cells burst (figure 10-7) (54). Thus, holins kill without warn¬ 
ing; the hole-formation pathway does not involve intermedi¬ 
ates that affect the pmf or ion gradients at all. Moreover, by 
titrating the pmf with the uncoupler dinitrophenol, it could 
be shown that S triggered as soon as the pmf was reduced 
about 40%, which would be at about −100 mV assuming 
the normal pmf is −180 mV (54). Thus S is not only a simple 
biological timer but also one of the simplest “gated” membrane 
proteins, sensitive to changes of about 60 mV. 

A simple alternative to the leakage model which may 
explain the timing of the saltatory hole formation is the 
“critical concentration” model. This model simply states that 
S accumulates in the membrane until a critical two- 
dimensional concentration or number density or mole 
fraction is achieved. At this point, the hole structure begins 
to form. If the new hole structure collapses the membrane 
potential, then the entire population of S molecules is trig¬ 
gered. A precedent for this is known in the formation of the 
two-dimensional crystalline array of bacteriorhodopsin in 
the purple membranes of halobacteria; bacteriorhodopsin 
monomers accumulate in the membrane until the critical 
concentration is reached, after which the crystal form accu¬ 
mulates and the monomer population stays constant (62,63). 
There are more sophisticated versions of this kind of model, 
depending on how many n-mer substrates there are before 
the hole-forming oligomer is formed. A simple prediction of 
this model is that if there are two different, non-interacting 


holins accumulating in the same cell, then the holin reach¬ 
ing its critical concentration first will dominate the timing of 
lysis; the presence of a holin which will reach its critical 
concentration later should be irrelevant. A preliminary test 
of this model has been done, using λ S and T4 t mounted in 
prophage and trans-activatable plasmids under pR' control. 
Ignoring for the moment the unproven assumption that S 
and T do not interact, the results were unambiguous: lysis 
triggers much faster with both holin genes being expressed 
than with either of them expressed under isogenic condi¬ 
tions (89). Simply put, it seems that both holins were doing 
something to the cell, before lysis, and the effects of the two 
holins were additive. Several reservations must be attached 
to this conclusion, however. First, it is of course possible 
that the two holins do interact, despite the absolute lack 
of sequence similarity, or interact with a common protein 
target in the membrane. The failure to find any bacterial 
mutations that affect holin function, as opposed to holin 
accumulation, may simply be bad luck. Also, it has not been 
shown that the kill-without-warning feature of S is shared 
by T, so the early lysis might reflect T-mediated leakage 
leading to early triggering of S. 
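
The contrast between the predicted and observed outcomes can be made concrete with a toy calculation. The sketch below is illustrative only: it assumes linear accumulation of each holin, and the rates, thresholds, and function names are hypothetical rather than taken from the experiments cited above. Under the strict critical-concentration model, the faster holin alone sets the lysis time. Under a simple additive alternative, in which each holin contributes a fractional load toward a shared triggering threshold, co-expression triggers earlier than either holin alone, which is the pattern reported (89).

```python
def trigger_time(rate, critical):
    """Time for a linearly accumulating holin to reach its own critical density."""
    if rate <= 0:
        return float("inf")
    return critical / rate


def lysis_time_independent(holins):
    """Strict critical-concentration model: non-interacting holins,
    so the first one to reach its own threshold sets the lysis time."""
    return min(trigger_time(rate, critical) for rate, critical in holins)


def lysis_time_additive(holins):
    """Assumed alternative: each holin adds a fractional 'load' (amount / critical),
    and triggering occurs when the summed load reaches 1."""
    load_per_minute = sum(rate / critical for rate, critical in holins)
    return float("inf") if load_per_minute <= 0 else 1.0 / load_per_minute


# Hypothetical parameters: (accumulation rate, critical density), arbitrary units.
HOLIN_S = (1.0, 50.0)  # alone, would trigger at 50 min
HOLIN_T = (1.0, 25.0)  # alone, would trigger at 25 min

independent = lysis_time_independent([HOLIN_S, HOLIN_T])
additive = lysis_time_additive([HOLIN_S, HOLIN_T])
print(f"independent model: {independent:.1f} min")
print(f"additive model:    {additive:.1f} min")
```

With these made-up numbers the independent model predicts lysis at the time of the faster holin, whereas the additive model predicts lysis earlier than either holin would trigger alone. The sketch does not distinguish among the reservations listed above (direct interaction, a shared membrane target, or T-mediated leakage); it only shows why faster lysis on co-expression is inconsistent with the simplest form of the model.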

Figure 10-7 The phage λ holin kills without warning. At t = 0, 
cells tethered to a microscope slide by antiflagellin antibodies 
were induced for the expression of the plasmid 
pS105 A52G, which carries the λ lysis cassette, including the 
S A52G early lysis allele, under the control of its cognate 
promoter, pR'. Single frames chosen from the recordings 
of a representative cell are depicted here to illustrate the 
saltatory nature of holin killing. Starting from the time 
point indicated to the left of the panel, single frames were 
captured every 200 ms. After induction of the lysis genes, 
the tethered cells rotate at high and constant speed (first 
row). About 20 min after induction, rotation of the cell 
abruptly slows and stops completely within 1-3 s (second 
row). Cell lysis, due to digestion of the cell wall by the λ R 
endolysin, occurs within several seconds after the sudden 
stop in rotation (third and fourth rows). Adapted from 
Gründling et al. (54), with permission. 

In any case, with the perspective of the tethered cell 
experiments, the timing problem is even more sharply 
defined. From an evolutionary viewpoint, these results are 
not surprising. S functions, as presumably all holins do, to 
terminate the infective cycle at an optimal time, consistent 
with the environment and the condition of the host. Sudden 
triggering and release or activation of excess endolysin 
without prior insult to the membrane serves this purpose 
perfectly. The capacity for macromolecular synthesis is 
undisturbed up until the instant of lysis, thus maximizing 
the burst size for the particular length of vegetative cycle 
and reducing the dwell time of finished virions in the 
nonproductive corpse. 

“Death Rafts”: A Model for Holin Function 

The extensive genetic analysis of S has provided clues to 
thinking about the timing issue: 

1. S105 alone exerts a precise timing regimen, so the S107 
antiholin is not directly involved in the core temporal 
regulation. 

2. S105 protein deleted for most of the highly charged 
cytoplasmic C-terminus retains a sharply defined 
lysis schedule. Thus, although increasing the balance 
of cationic charge in the cytoplasmic domain can 
retard the triggering time, this domain is not required 
for the intrinsic timing function (18). Similar analyses 
at the N-terminus are more difficult to interpret, 
because increasing the number of positively charged 
residues in this domain would in effect re-create a 
S107-equivalent antiholin protein (101). 

3. Mutations mapped to both of the interhelical loops and 
to all the helical surfaces of all three TMDs change 
the timing of lysis unpredictably, apparently at the 
level of function rather than synthesis or stability 
(52, 53). 

4. Mutant alleles defective in lysis have unpredictable 
phenotypes in the presence of the parental allele. 
Some are dominant, some are recessive, and some are 
“antidominant”; that is, some lysis-defectives, in the 
presence of S, advance the triggering time as much 
as a second parental allele (88). 

5. Lysis-defective alleles appear to be blocked at 
the monomer-dimer, dimer-oligomer, or oligomer- 
“hole” transitions (53). 

It is difficult to rationalize these findings with any model 
where a defined, regular structure is assembled in the 
membrane. Moreover, the unprecedented diversity in holin 
sequences and structures seems unlikely to reflect many 
different ways to make a sophisticated pore structure. The 
common functionality of all these proteins, differing not 
only in amino acid sequence but in membrane topology, 
suggests that evolving into a holin sequence must be 
relatively easy. Taken together, the phenotypic complex¬ 
ity and genetic diversity suggest that there must be a 
common fundamental property of S and other holins. 

Figure 10-8 The “death raft” model for holin function. 

A: Holin rafts. It is proposed that holins accumulate in rafts in 
the membrane, shown here in a schematic top-down view, 
and that intimate intermolecular and intramolecular helix 
packing between the TMDs of the holins largely excludes 
lipids. Each circle represents a single holin molecule. 
Spontaneous formation of an aqueous channel by thermal 
fluctuation is depicted. The localized depolarization causes a 
conformational change in the holins leading to asymmetric 
disruption of the helix packing, exposure of a relatively 
hydrophilic surface, and dispersion of the subunits into the 
holin lesion. B: “Hole” size. Based on the raft model above, 
a rationale for the formation of different-sized lesions by 
holins, depicted in a cross-sectional schematic of the 
membrane. Large lesions would be formed by the triggering 
of the large rafts of holin #1, whereas smaller lesions are 
formed in #2, which could represent a mutant of holin 
#1 or a heterologous holin that normally functions with a 
sec-exported endolysin and thus is not required to form 
a large lesion to allow endolysin release. Adapted from 
Wang et al. (109), with permission. 

It is proposed here that this property is that holins are capable 
of accumulating, by intimate packing of their TMDs, in oligo¬ 
meric patches or rafts in the membrane in which the lipid is 
largely, if not completely, excluded from the protein-protein 
interfaces (figure 10-8). As these patches grow, either by 
merging of smaller rafts or by accretion of dimers into rafts, 
the forces on these patches change, dominated early by 
lipid-protein interactions but, as the average patch diameter 
increases, later by protein-protein interactions. The local 
surface tension, conductance, and capacitance properties 
of these patches are obviously very different from that of the 
normal membrane, especially considering the nearly mega¬ 
volt per centimeter voltage across the bilayer. Moreover, 
these properties would be expected to be influenced drama¬ 
tically by every change in every surface and loop of the S 
protein. The simplest model is that at some patch size, a 
portion of a raft momentarily gives way to dispersive forces, 
creating a local, microscopic ionic current and local depolar¬ 
ization. With functional S alleles, this would lead to a 
concerted local conformational change (perhaps in the aver¬ 
age angles from the orthogonal of the TMDs, for example) 
that shifted the balance in favor of dispersive forces. Thus an 
acute local collapse of the energized state of the membrane 
would be propagated, rapidly triggering the collapse of all 
the rafts in the membrane. The terminal phenotype might 
be ragged holin-lined lesions in the membrane, perhaps 
reflecting the distribution of raft sizes at the instant of 
triggering. 

To take this kind of model seriously would require, at 
minimum, direct demonstration of large, two-dimensional 
arrays of S, a task made difficult by the extreme triggering- 
sensitivity of cells in which functional S is accumulating, 
where even brief anoxia imposed on the cells causes instant 
lysis. Also, if energized liposomes can be loaded with S 
protein labeled with fluorescent probes, changes in the 
physical state of S associated with sudden depolarization 
might be correlated with the extent and kinetics of mem¬ 
brane disruption, assessed by dye release and visualization 
with electron microscopy. 

Why Are There Antiholins? 

The existence of S107 as an antiholin for the λ holin at first 
seems an elegant solution to the timing problem, until one 
contemplates the very minor phenotype of ablating the anti¬ 
holin. As noted above, under laboratory induction condi¬ 
tions, lysis timing is accelerated only about 5 minutes in 
mutants producing only S105 at its normal rate (88). It 
would be simple to find a missense allele of S with the 
slightly early lysis time, so why have the antiholin at all? In 
the P1 system, the timing of lysis supported by the class I 
holin LydA in the absence of the antiholin LydB is so early 
that no virions have been assembled. However, even here, 
extrapolating from the canonical class I holin S, it should be 
easy to select a lydA mutant which triggers at the normal 
time, and thus eliminates the need for an antiholin. The 
answer may lie in the remarkable properties of the T4 antiholin 
RI, which under certain circumstances is activated 
to override the normal timing schedule of the holin T. In 
this case, the antiholin provides a real-time adjustment for 
the evolutionarily selected holin timing schedule. One may 
suppose that there are conditions where the S107/S105 
ratio is altered and thus the normal timing of the parental 
S105 protein is overridden. In λ, RNA structures define the 
proportion of holin and antiholin, and, as noted above, 
there is evidence for an RNA-binding host factor being 
involved in the translatability of the S mRNA. There may be 
environmental conditions which, through this host factor, 
determine the holin-antiholin production ratio and thus 
directly override the lysis timing programmed into the S105 
sequence. Similar structures are found in other dual-start 
motifs (22). 

Lysis Without Lysozyme 

It appears that all phages with double-stranded nucleic acid 
genomes use the holin-endolysin system for lysis. This must 


reflect a powerful evolutionary pressure for optimization of 
the lytic event, both in terms of efficiency of release of the 
progeny virions and, probably more so, in terms of terminat¬ 
ing the vegetative cycle at a time appropriate for the particu¬ 
lar host and environment. However, no lysozyme is encoded 
or produced by three classes of very simple lytic phages with 
small, single-stranded nucleic acid genomes: the ssDNA 
Microviridae and the ssRNA male-specific Leviviridae and 
Alloleviviridae, the prototypes for which are φX174, MS2 and 
Qβ, respectively (figure 10-9). It has been a mystery for 
decades how these small viruses achieve lysis of the host 
and liberation of the progeny without producing a muralytic 
activity. In each case, a single phage gene was shown to be 
necessary and sufficient for host lysis: E for φX174, L for MS2 
and A2 for Qβ (5, 8, 60, 114). 

Probably the two most heavily promoted models for 
lysis in these “single-gene” systems have been: 

1. Autolysis: according to this view, the viruses induce 
endogenous muralytic activities which, although 
functional in vivo, are not detectable by in vitro 
assays (19, 71, 72,107). 

2. Transmembrane tunnels: according to this model, promulgated 
for φX174, the lysis protein E forms an oligomeric 
tunnel structure through the entire envelope, 
allowing the cytoplasmic contents to escape (116-118). 

Recently, the molecular basis of the lytic function of the 
φX174 E protein and the Qβ A2 protein has been established 
unequivocally, by a combination of genetic and biochemical 
techniques. In both cases, the lysis protein has been 
shown to be an inhibitor of a specific enzyme in the pathway 
for cell wall synthesis (10, 12, 14). This may be regarded 
as the fundamental mechanism, although the sequelae 
of inhibition of cell wall synthesis, either by phage lysis 
protein or by fungal antibiotic, may very well be 
considered “autolysis,” in the sense that some host proteins 
are involved in the terminal phenotype. However, the 
transmembrane tunnel model is definitely ruled out, since E 
has been shown to be a specific enzyme inhibitor rather 
than a component of a pore structure spanning the 
envelope. 

φX174 E 

The φX174 genome has only 10 genes, and the lysis 
gene, E, spans only 91 codons in an alternate reading 
frame embedded in an essential gene D encoding a capsid 
scaffolding protein (figure 10-9). E is a membrane protein: 
its first 35 residues define the lytic function and the 
membrane insertion domain. Although its topology is 
uncertain, several lines of evidence indicate the C-terminus 
is cytoplasmic. Replacement of the entire C-terminal 
cytoplasmic domain by β-galactosidase, GFP, or any folded 
domain has no effect on lysis of wild-type hosts (20, 26, 73). 

Figure 10-9 Single-gene lysis systems in ssDNA and ssRNA phages. Lysis genes are shaded. Top: Map of MS2, the prototype 
group I ssRNA phage, with the L sequence displayed above the map. L sequences from the group I phage fr (3), the group II 
phage GA (61), and the Acinetobacter phage AP205 (69) are arrayed above MS2 L for comparison. Middle: Map of Qβ, the 
prototype group III ssRNA phage (105). The regions of close sequence similarity between Qβ A2 and MS2 A are hatched, and 
the locations of the por mutations, selected for plaque-formation on a rat host, are indicated by stars (14) (I.-N. Wang and R. 
Young, unpublished data). Bottom: Map of the circular ssDNA phage φX174, shown linearized at the start of the A gene. The 
sequence of the E lysis protein is displayed below the map, along with an alignment of the ortholog E proteins from the 
related phages (top to bottom) α3, G4 and S13 (47, 121). 

E35-GFP fusions have been constructed which are fully 
active lytically; fluorescence microscopy shows clearly 
that the hybrid E protein is uniformly distributed in the 
membrane (W. D. Roof and R. Young, unpublished data). 

Figure 10-10 Phage amurins inhibit different steps in cell wall synthesis. A simplified version of the cell wall synthesis 
pathway, adapted from Nanninga (78). The jagged line in the membrane represents undecaprenylphosphate. Abbreviations 
used are: G, GlcNAc; M, MurNAc; pep3, tripeptide; pep5, pentapeptide; PBPs, penicillin-binding proteins. Two different steps 
are specifically blocked by the φX174 and Qβ amurins (see text). 

Early attempts at defining the target of E by isolating 
resistant host mutants were frustrated by the fact that 
knockout mutations in a nonessential host gene, slyD, 
abolished E-mediated lysis (73). SlyD is a peptidyl-prolyl 
cis-trans isomerase (PPIase), orthologous to mammalian 
FKBP, the receptor for the immunosuppressant drugs 
FK506 and rapamycin (93). In the absence of SlyD, the E 
protein is very unstable, with a half-life of about 2 minutes 
(11). φX174 successfully infects slyD cells but, instead of 
lysing the host cell, accumulates cytoplasmically to very 
high levels (94). φX174 pos mutants (plates on slyD) were 
isolated which overcome the lysis block: the pos mutations 
are E mutations which increase the synthesis rate by about 
10-fold, although the instability in the absence of SlyD is 
retained (11). An Epos allele cloned in a plasmid expression 
vector was used to select lysis-resistant mutants which 
mapped to mraY, a conserved membrane-embedded enzyme 
which catalyzes the formation of the first lipid-linked step in 
peptidoglycan biosynthesis (figure 10-10). The mutations 
map to TMD5 and TMD9 of MraY and block the specific 
inhibition mediated in vivo and in vitro by E, which does 
not inhibit a related membrane-embedded enzyme, Rfe, 
that also uses undecaprenol phosphate as a recipient 
and a UDP-activated donor molecule (10, 12). The basis 
of the enzyme inhibition is unknown, but it is worth noting 
that the C55 carrier is in very limited quantities in the 
membrane, so it is possible that the hydrophobic mem¬ 
brane domain of E competes for the carrier binding site. 
In any case, E is the first of a new class of lysis proteins, 
the amurins, which cause lysis by inhibiting a specific 
enzymatic step in murein synthesis. 
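
The logic of the pos suppressors, which raise synthesis without curing the instability, can be illustrated with a first-order turnover calculation. This is a minimal sketch under stated assumptions: constant synthesis and exponential decay are assumed, the 2-minute half-life comes from the text, and the 20-minute half-life used for E in the presence of SlyD is a hypothetical placeholder, as are the function and variable names.

```python
import math


def steady_state_level(synthesis_rate, half_life_min):
    """Steady-state protein level for constant synthesis and first-order decay.

    level = k_syn / k_deg, with k_deg = ln(2) / half-life.
    """
    k_deg = math.log(2) / half_life_min
    return synthesis_rate / k_deg


UNSTABLE_HALF_LIFE = 2.0   # min; E in the absence of SlyD (from the text)
STABLE_HALF_LIFE = 20.0    # min; hypothetical value for E with SlyD present

wild_type_with_slyd = steady_state_level(1.0, STABLE_HALF_LIFE)
wild_type_no_slyd = steady_state_level(1.0, UNSTABLE_HALF_LIFE)
pos_no_slyd = steady_state_level(10.0, UNSTABLE_HALF_LIFE)  # ~10-fold synthesis

print(f"wild-type E, SlyD present: {wild_type_with_slyd:.2f}")
print(f"wild-type E, slyD host:    {wild_type_no_slyd:.2f}")
print(f"Epos, slyD host:           {pos_no_slyd:.2f}")
```

In this simple picture, a 10-fold increase in synthesis rate raises the steady-state level 10-fold even though the half-life is unchanged, which is how a pos allele could restore a lytic level of E in a slyD host.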

In terms of how E-mediated lysis might be regu¬ 
lated, there is no evidence for regulation of E expression in 
φX174, where all transcripts include E (39). Lysis by E, as 
with the fungal antibiotic mureidomycin, a specific MraY- 
inhibitor, requires cell division and appears to be effected 
by a failed septation event (24, 64). This is consistent 
with the finding that fts mutants at the nonpermissive 
temperature are resistant to E-mediated lysis (115). Thus 
it would appear that the infective cycle of φX174 is defined 
by the cell-cycle stage at which the infection begins. Unless 
there are unknown compensatory strategies, phages which 
infect early after septation should have a longer vegetative 
phase and higher burst than phages which infect late in 
the cell cycle. It has not been ruled out that SlyD may mediate 
interactions with host physiology and thus provide a conduit 
for information relevant to the timing of lysis: interestingly, 


overproduction of SlyD leads to an immediate and reversible 
block in septation (92). 

Qβ A2 

Like its orthologue A of the Leviviridae, the A2 protein of Qβ 
and other Alloleviviridae is a >40 kDa single-copy component 
of the virion, with multiple roles including binding 
to the male pilus, RNA-binding during capsid assembly, 
protection of the RNA 3' terminus, and penetration, with 
the RNA, into the cytoplasm of the host (105). Unlike MS2 
A, however, A2 is also the sole lysis protein of the phage 
(114). The discovery that φX174 E was an amurin, 
and evidence associating A2 with a blockage of cell wall 
synthesis, suggested that A2 might be another inhibitor 
of the murein synthesis pathway (13, 14, 82). Unlike the case 
with E, however, no lipid-linked or soluble precursors 
accumulated during A2-mediated lysis, suggesting that 
an early step in the pathway was blocked. Mutants designated 
rat (resistant to A-two) were isolated as survivors of induction of a 
cloned A2 gene (13). Eight independent, spontaneous rat 
mutants have been obtained and all are identical, with a 
Leu138Gln change in MurA, the conserved enzyme catalyzing 
the committed step of murein biosynthesis (14). Efforts to 
demonstrate MurA inhibition with purified A2 have been 
frustrated by the latter’s extreme insolubility even in 8 M 
urea, but, remarkably, high-titer suspensions of virions, 
each bearing a single copy of A2, cause inhibition of 
MurA, but not of the rat mutant MurA. The ability of the virion-mounted 
A2 to inhibit MurA despite supersaturating concentrations 
of both substrates suggests the inhibition is noncompetitive 
or uncompetitive (14). The position of the rat mutation 
corresponds to an outside surface of the enzyme, far from 
the active site cleft. 

Attempts using in vitro mutagenesis to separate the lysis 
function from the other functions of A 2 were unsuccessful 
(114). Comparison of the MS2 A and Qβ A2 sequences is 
not informative as to a likely localization of a lysis domain 
because the two sequences are so dramatically divergent. 
However, Qβ por (plates on rat) mutants map to the 
N-terminal 120 residues of A2, which is also one of the 
regions lacking similarity to A, suggesting that this 
portion of the protein constitutes the amurin domain 
(figure 10-9) (14). 

The presence of the MurA inhibitor domain on the sur¬ 
face of the virion provides an obvious lysis control mecha¬ 
nism; when virions accumulate to sufficient levels to block 
the available MurA, cell wall synthesis stops and lysis 
ensues, if the cells are in growth phase. This link between 
virion content and lysis is broken, however, if cells are no 
longer dividing; interestingly, grossly higher virion yields 
can be obtained if late logarithmic cultures are infected and 
incubated until saturation (105). 
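
The titration logic described above can be sketched as a simple threshold simulation. This is an illustration under assumptions, not a model from the cited work: it assumes that one virion-borne A2 inactivates one MurA molecule, that virions accumulate at a constant rate, and that lysis follows as soon as MurA is fully blocked in a growing cell. The rate, the MurA copy number, and all names are hypothetical.

```python
def simulate_infection(virions_per_min, mura_copies, growing, minutes):
    """Accumulate virions minute by minute; lyse when virions titrate out MurA.

    Assumes 1:1 inhibition of MurA by virion-borne A2 and that lysis requires growth.
    """
    virions = 0
    for minute in range(1, minutes + 1):
        virions += virions_per_min
        if growing and virions >= mura_copies:
            return {"lysed": True, "time": minute, "yield": virions}
    return {"lysed": False, "time": None, "yield": virions}


# Hypothetical parameters, chosen only to make the contrast visible.
growing_culture = simulate_infection(100, 1000, growing=True, minutes=60)
saturated_culture = simulate_infection(100, 1000, growing=False, minutes=60)

print("growing cells:    ", growing_culture)
print("non-growing cells:", saturated_culture)
```

In growing cells the yield is capped at the point where virions match the available MurA, whereas in non-growing cells accumulation continues for the whole incubation, which is consistent in spirit with the higher yields reported for late logarithmic cultures.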

MS2 L 

The MS2 L gene is embedded across the end of the coat and 
beginning of the rep genes of the Leviviridae (figure 10-9). 
The local RNA secondary structure and the dynamics of 
ribosome movement in the immediate region of the start 
codon are critical for proper expression of the lysis gene, 
which has its translational initiation region occluded by a 
large secondary structure. The importance of this structure 
for the biological fitness of the phage, presumably because 
it is required to avoid inappropriately high synthesis of 
the L protein and thus the premature onset of lysis, has 
been shown in elegant suppressor studies by van Duin’s 
group, who eliminated the secondary structure by a series 
of wobble-position changes within the coat gene (70). Serial 
passage of the mutated phage resulted in the accumulation 
of multiple suppressor mutations, again in wobble posi¬ 
tions, that restored the secondary structure repressing L 
translation. 

The predicted L product is virtually a mirror-image of E, 
in that its 75 residues are primarily hydrophilic and cationic 
in the first two thirds of its predicted sequence, and mostly 
hydrophobic in the last third (figure 10-9). Deletion analysis 
suggests that its lytic capacity resides in the 32 C-terminal 
residues (9). Unlike the case with E and A2, induction of a cloned L 
gene does not lead to an inhibition of incorporation 
of 3H-diaminopimelate into murein before lysis (T. G. 
Bernhardt and R. Young, unpublished data). Thus, if L is 
also an amurin, it must affect a step beyond the first cova¬ 
lent attachment of the precursor disaccharide-pentapeptide 
into the murein layer. The L protein has been reported 
to be localized to zones of adhesion, to require the host 
membrane-derived oligosaccharide system for its lytic 
function, and to induce the host autolytic system (59, 
106, 107). A synthetic polypeptide corresponding to its 
C-terminal 25 residues has been shown to permeabilize 
liposomes to fluorescent dyes (48). Other than the deletion 
analysis described above, no mutational study of L has 
been reported, nor has there been a genetic analysis to find 


host mutants with altered L-sensitivity. In the absence of 
such studies, it is unclear whether these diverse phenomena 
are directly related to the lytic function of L. 

Perspectives and Directions 

In recent years remarkable strides have been made in 
our understanding of the molecular basis of phage lysis 
and its regulation. The progress has raised many new ques¬ 
tions. That there are at least two general strategies for lysis 
is clear. All complex phages seem to use holin-endolysin 
lysis, whereas two of the prototype small ssDNA and ssRNA 
phages encode proteins which act as inhibitors of cell wall 
synthesis. The diversity of holins has always been a stunning, 
and somewhat daunting, feature of lysis, suggesting 
that there may be several fundamentally different lysis 
mechanisms. Nevertheless, the basic features of all these 
systems are still comfortingly common, especially the ability 
to be triggered by energy poisons. It will be useful to 
assess whether class II and class III holins share the “kill 
without warning” property of the S holin, as they should 
if our understanding of what drives holin evolution is 
correct. In addition, although the discovery of the secretory 
endolysins means that the muralytic enzymes can no longer 
be regarded as “dumb” reporter functions for the activity 
of holins, it has perhaps further brought into focus how 
important the timing function is for the lytic event that 
terminates all dsDNA phage infections. Holins apparently 
have evolved, perhaps independently and at multiple times, 
to provide a temporal schedule for lysis and to allow the opti¬ 
mization of that schedule. Competition experiments to test 
the fitness of various holin genes and series of timing- 
mutant alleles of a particular holin should be instructive. 
In terms of the mechanism of holin function, it is clear that 
the next step must include cytological localization of the 
holins to assess whether patches or rafts are formed in the 
membrane during the vegetative cycle. In vitro experiments 
in which holin function is reconstituted in artificial lipid 
vesicles should help clarify the roles of concentration, 
oligomerization, and membrane energization in the hole- 
formation process, as well as providing structural insight 
into the nature of the lesions. These same systems can also 
be exploited for investigating the mechanisms by which 
antiholins block holin function. The diversity of topologies 
available to the known antiholins suggests that a wealth of 
specific interactions underlie the regulatory properties of 
these molecules, interactions which should be fertile areas 
for genetic analysis. 

With regard to the single-gene systems, it will be inter¬ 
esting to see whether MS2 L is also an amurin, or if a 
third general strategy is available, perhaps a “magic button” 
that allows the induction of so-called autolysis without 
disturbance of murein synthesis. It should be noted that 
a wide variety of ssRNA phages have been isolated against 
a number of different R factors (23). It is a reasonable expectation 
that some of these ssRNA phages may have evolved 
amurins against targets other than MurA. Recently, MH2K, 
a lytic Microvirus of Bdellovibrio, was isolated and sequen¬ 
ced. MH2K lacks a scaffolding protein gene equivalent to D 
(25) and, consequently, lacks the E lysis cistron embedded 
in D in φX174 and its coliphage relatives. Instead, the 
candidate lysis genes are short open reading frames embed¬ 
ded in other essential MH2K genes. It will be interesting 
to test whether this independently evolving lysis gene 
also targets MraY. Interestingly, a Microvirus sequence 
for the wall-less intracellular bacterium Chlamydia has no 
obvious reading frame available for a lysis gene (103). It is 
unknown how a phage can cause lysis of a host cell that 
grows, without a cell wall, in the iso-osmotic environment 
of a mammalian cell cytoplasm. 
Sometime in the 1920s, a plaque was placed in a test tube
labeled #174. It must have had a certain luster. Since 
then that phage has challenged genetic, biochemical, and 
biophysical dogmas. From the very beginning, it was 
unique. It was unusually tiny, readily passing through the 
smallest of ultrafilters. Sertic and Bulgakov (111) used its 
size to define a “race”: race X (ten). Hence the name became 
φX174. The first electron micrographs revealed a small,
isometric particle without an elaborate, and to some unseemly,
tail structure. As the study of molecular biology progressed,
the particle refused to behave. Its genome was single-stranded
DNA (117), not a double-stranded molecule.
Although the data used to draw the first genetic maps were 
accurate, revealing the existence of the overlapping reading 
frames, the cistrons were arranged in an orderly and linear 
manner (12). However, when the genome was sequenced 
(1, 107) the raw data were proved correct. This complex 
genetic arrangement was so unexpected that many suspected
an extraterrestrial origin. As the New York Times
reported the theory (127), an advanced race engineered
φX174 and disseminated it into the cosmos where it would
“persist until the evolution of intelligent life and finally of 
investigators interested in the genetics of phage.” Attempts 
were made to find the hidden message in the genome, a 
holy truth or grail, maybe directions to an inhabited planet; 
however, none was uncovered. Perhaps the only aspect of 
φX174 that lends itself to science fiction is its name.

Since the last edition of The Bacteriophages, the results 
of φX174 and Microviridae research have continued to challenge
paradigms. The atomic structures of the φX174 virion
and grotesquely beautiful procapsid have been solved (35, 36, 
92, 93, 94). As the genome sequence challenged genetic 
dogmas, the structures of the external scaffolding protein 
have challenged fundamental ideas of protein folding and 
particle morphogenesis. Furthermore, the results of biophysical
studies suggest a scaffolding-like function for the
genome (56) during the final stages of morphogenesis.
In contrast to the morphogenesis of large dsDNA viruses, these
final stages involve a radial collapse of the procapsid to form the virion,
as opposed to an expansion. The characterization of a new 
Microviridae subfamily (20, 88, 104, 126) has offered insights
into the mechanisms driving icosahedral single-stranded 
DNA viral evolution, which appear to be fundamentally 
different from those driving the evolution of the double- 
stranded DNA viruses (66, 67). And finally, studies of 
phage-mediated lysis mechanisms demonstrate how the
Microviridae lyse cells not with lysozymes but with
“antibiotic-like” proteins (15, 16, 106). Quite frankly, φX174 is
a rogue; however, to those who work with the enfant terrible, 
its recalcitrant nature is also its charm. 

This chapter will concentrate on research conducted 
since the last edition of this book. Therefore, topics such as
host cell recognition, genome penetration, gene expression,
and DNA replication, which have not been active areas of
research in the last decade, will only be briefly summarized.
Readers wishing for a more comprehensive review of these 
topics should consult the chapter by Masaki Hayashi (62) 
in the previous edition of this book. 

Host Cell Recognition and Penetration 

φX174 attaches to host cells via a sugar molecule in the
lipopolysaccharides (LPS) of Gram-negative bacteria (47, 73, 76, 79,
80). Bacterial strains that do not produce LPS containing
specific terminal glucose and galactose molecules are resistant
to φX174 infection (47). The E. coli gene rfab, located in
the rfa cluster at 81 minutes, most likely encodes the requisite
LPS biosynthetic enzyme (109). In solving the atomic
structure of the φX174 virion, the crystallized particles
were purified in sucrose gradients (92-94). As a result of 
this technique, glucose molecules were visualized bound to 
the coat protein in a defined position. The genes and gene 
products of φX174 are given in table 11-1. The six residues
that constitute this binding site are strongly conserved in 
the other coliphage Microviridae (52, 83). More sophisticated 

Table 11-1 The φX174 Gene Products

Protein   Function
A         Stage II and stage III DNA replication
A*        An unessential protein for viral propagation. It may play a role in the inhibition of host cell DNA replication and superinfection exclusion
B         Internal scaffolding protein, required for procapsid morphogenesis and the assembly of early morphogenetic intermediates. Sixty copies present in the procapsid
C         Facilitates the switch from stage II to stage III DNA replication. Required for stage III DNA synthesis
D         External scaffolding protein, required for procapsid morphogenesis. Two hundred and forty copies present in the procapsid
E         Host cell lysis
F         Major coat protein. Sixty copies present in the virion and procapsid
G         Major spike protein. Sixty copies present in the virion and procapsid
H         DNA pilot protein needed for DNA injection, also called the minor spike protein. Twelve copies in the procapsid and virion
J         DNA binding protein, needed for DNA packaging. Sixty copies present in the virion
K         An unessential protein for viral propagation. It may play a role in optimizing burst sizes in various hosts


crystallographic studies have elucidated the molecular
basis of Ca2+ dependence for host cell recognition (73).
Upon soaking the ion into the crystal, the atoms bind to 
defined locations at the 3-fold axes of symmetry. This 
causes the amino acid side chains of the glucose binding 
site residues to assume an ordered conformation. 

The interaction between the glucose binding site and 
the LPS most likely accounts for the reversible reaction 
observed in kinetic experiments (75), which is followed by 
an irreversible reaction (77) the molecular basis of which 
remains obscure. DNA ejection can be simulated in vitro 
by high ionic conditions (138). Viral DNA is extruded from 
5-fold vertices. In addition, φX174 host range mutations
confer amino acid changes in the 5-fold related spike 
proteins G and H (98, 118, 136). LPS binding by these 
proteins has also been reported (78, 82). Host range coat 
protein mutations do not map to the strongly conserved 
glucose-binding site, but to a series of amino acids which 
trace the G protein around the 5-fold axes of symmetry (31). 
These results argue for the existence of a second host cell 
receptor, perhaps required for penetration. Although the 
identity of this second factor is unknown, a gene required 
for φX174 sensitivity (phxB) has been defined and mapped
to minute 17 of the E. coli chromosome (96). In the E. coli 
genome sequence (18) there are several genes which could 
encode this second receptor or be involved in its synthesis. 
The tail-less φX174 penetration process may not differ
significantly from that of the large tailed bacteriophages
that “walk” along the surface of the cell, via tailspike
interactions, until they find a receptor for penetration. Instead of
walking, φX174 may simply roll.

Electron micrographs of φX174 infections show the vast
majority of adsorbed particles are embedded at points of 
adhesion between the cell wall and inner membrane (11, 21), 
suggesting the location of the hypothesized second receptor 
and indicating that the genome may be ejected directly 
into the cytoplasm. Penetration requires the viral H protein,
also called the DNA pilot protein, which also enters into the
cell (79, 80). In the atomic structure of the φX174 virion
(73, 93, 94) diffuse density is located at each 5-fold vertex 
and has been attributed to protein H. A genetic analysis of a 
cold-sensitive H protein defective in DNA ejection supports 
this hypothesis (J. Oberste and B. A. Fane, unpublished 
data). Second-site suppressors of this allele map to two 
places in the major spike protein. One set of suppressing 
residues lines the channel that passes through each 5-fold 
vertex. The affected amino acids participate in 5-fold related 
contacts, hence maintaining the channel’s integrity. The 
second set alters the interface between β-strands B and I
of the G protein β-barrel, suggesting that conformational
switches within this interface may mediate the opening of 
the channel. 


DNA Replication 

Positive polarity single-stranded DNA replication strategies 
are complex, occurring in three separate stages. These 
processes will be briefly summarized; a more complete discussion
can be found in the previous edition of this book
(62). Stage I DNA replication involves the conversion of the 
single-stranded genome into a covalently closed, double- 
stranded, circular molecule, called RF I DNA (replicative 
form one DNA). Purified single-stranded Microviridae DNA 
produces progeny in transfection experiments. Therefore, 
host cell proteins are both necessary and sufficient for 
stage I replication in vivo. The entire reaction has been 
reconstituted in vitro and requires 13 host proteins (84, 85, 
114). A stem-loop structure in the FG intercistronic region 
serves as the n' protein recognition site (113), and initiates 
the assembly of the primosome. As with most DNA priming 
structures, the stem-loop is not destabilized by single- 
stranded DNA binding protein (ssb). Additional proteins add 
to the complex in successive steps (6, 7, 89). After primosome
assembly, the complex migrates along the single-stranded
DNA in a 5'→3' direction (5), synthesizing RNA primers
for DNA replication. Addition of the holoenzyme leads to 
chain elongation. 

During stage II DNA synthesis, RF I DNA is amplified. 
In addition to the 13 host cell proteins required for stage I 
replication, stage II replication is dependent on the viral A 
protein and the host cell helicase, rep protein (41). Replication
of the (+) strand proceeds through a rolling circle
mechanism (39). The viral A protein binds to the origin of 
replication, a 30-nucleotide sequence between residues 
4299 and 4328, in the RF II DNA (10) and nicks it to initiate 
(+) strand synthesis (41, 135). The origin is separated into 
three functional domains, a binding domain, a cleavage 
recognition site, and an AT spacer region. The AT spacer 
region, which is located between the other two domains, is 
believed to facilitate helix unwinding. After nicking, protein 
A forms a covalent ester bond with the DNA (40). The host 
cell rep protein unwinds the helix and strand separation 
is stabilized by ssb (110). After one round of rolling circle
synthesis, protein A cuts the newly generated origin and acts as a
ligase, generating a covalently closed circular molecule
(22, 39). Minus strand synthesis is mechanistically similar 
to stage I DNA synthesis. 

Stage III DNA synthesis involves the concurrent synthesis
and packaging of the single-stranded DNA genome.
Viral procapsids and protein C are required for this reaction
(61, 95). The viral C protein associates with proteins A and rep
on the RF II DNA. The C protein may also serve as an inhibitor
of double-stranded DNA synthesis (4). Aoyama and
Hayashi (4) proposed a model in which ssb and protein C
compete to bind single-stranded DNA located in the protein
A-rep-RF II complex. If ssb binds first, another round of
stage II DNA synthesis occurs. Conversely, if protein C
binds, stage III DNA synthesis and packaging ensue. This
competition, however, can only occur at the initiation or reinitiation
of DNA synthesis. Protein C will not inhibit stage II DNA 
synthesis after it has begun. 
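
The competition model described above can be restated as a small decision rule. The following Python sketch is only an illustrative encoding of the prose; the enum, the function names, and the string labels are hypothetical and are not taken from the cited work.

```python
from enum import Enum


class Binder(Enum):
    """Which protein binds the single-stranded DNA first (hypothetical labels)."""
    SSB = "ssb"
    PROTEIN_C = "protein C"


def outcome_at_initiation(first_binder: Binder) -> str:
    """Outcome when the protein A-rep-RF II complex initiates or reinitiates."""
    if first_binder is Binder.SSB:
        return "stage II"   # another round of double-stranded DNA synthesis
    return "stage III"      # single-stranded DNA synthesis and packaging


def outcome_once_begun(current_stage: str, protein_c_binds: bool) -> str:
    """After a round has begun, protein C binding does not redirect it."""
    return current_stage


print(outcome_at_initiation(Binder.SSB))
print(outcome_at_initiation(Binder.PROTEIN_C))
print(outcome_once_begun("stage II", protein_c_binds=True))
```

The second function exists only to make explicit that, in this model, the choice is made at initiation or reinitiation and not during an ongoing round.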

The protein A-C-rep-RF II DNA complex must dock to 
a viral procapsid for stage III DNA synthesis to commence. 
Procapsid binding does not occur in the absence of protein C 
(4). Second-site suppressors of mutant rep proteins, which 
abrogate procapsid binding, map to the A and viral coat 
proteins (43, 128, 133). The coat protein suppressors are 
clustered along the 2-fold axis of symmetry in the atomic 
structure of the virion (35, 36, 92-94), suggesting a location 
for the docking site. In addition, procapsids containing 
mutant external scaffolding proteins, which are also located 
at the 2-fold axes of symmetry, cannot be filled (see below). 
Again, packaging can be restored with mutations in protein 
A. Mechanistically, stage III DNA synthesis is similar to
stage II (+) strand synthesis. Protein A nicks the origin
(134) and forms a covalent ester bond with the DNA (40). 
After one round of rolling circle synthesis, it cuts the newly 
generated origin and acts as a ligase, generating a covalently
closed circular molecule (22, 39). The packaging reaction
is now complete. Even if the packaged genome has been 
altered to be less than unit length, reinitiation, as seen 
in headful packaging systems (87), does not occur (3). 
The origin is both necessary and sufficient for packaging 
specificity (3, 49). Any circular DNA molecule with a Microviridae
origin of replication can serve as a template (3, 56).
The participation of the DNA-binding protein in packaging 
is discussed below. 


Gene Expression 

Since Microviridae contain single-stranded genomes of positive
polarity, the negative strand must be synthesized before
any genes can be transcribed and proteins translated. Unlike
larger double-stranded DNA bacteriophages, Microviridae
do not utilize trans-acting mechanisms to ensure temporal
gene expression. Therefore, the timing and relative production
of viral proteins are entirely dependent on cis-acting
regulation signals: promoters, transcription terminators, 
mRNA stability sequences, and ribosome binding sites. 
Promoters are found upstream of genes A, B, and D (8, 9, 
107, 119, 120-122), and terminators are found after genes 
J, F, G, and H (figure 11-1). The terminators are not 100% 
efficient, which leads to a wide variety of transcripts of varying
lengths. However, there is a rough correlation between
the abundance of a gene transcript and the amount of the 



Figure 11-1 The genetic map of φX174. The promoters and
transcription terminators are indicated on the linear map of
φX174. Line thickness indicates the relative abundance of
the transcripts. The gene A transcript is very unstable; the
terminator for this transcript is unknown. Adapted from
Hayashi et al. (62).

Figure 11-2 Phage φX174 morphogenesis.


encoded protein required for the viral life cycle (63). For 
example, gene D transcripts are the most abundant in the 
cell and protein D is the most abundant protein in the 
procapsid. Similarly, there are more gene F, J, and G transcripts
than transcripts of gene H. The relative stoichiometry
of these structural proteins is 5:5:5:1, respectively. Protein 
expression is also affected by the stability of the various 
mRNAs. Stability appears to be a function of the 3' end of the
message (65) and each mRNA species decays with a characteristic
rate (64). Transcripts of gene A decay very quickly
(65), which may ensure that this nonstructural protein 
is not overexpressed. And finally, regulation can also be 
achieved at the translational level. Despite gene E’s location
within gene D, the most abundant transcript, few
E proteins, which mediate cell lysis, are translated due to a 
weak ribosome binding site (17). 
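
The 5:5:5:1 stoichiometry quoted for proteins F, J, G, and H follows directly from the virion copy numbers listed in table 11-1 (60 copies each of F, J, and G; 12 copies of H). A minimal Python sketch of that arithmetic, in which the dictionary and variable names are illustrative only:

```python
from functools import reduce
from math import gcd

# Virion copy numbers as listed in table 11-1
copies = {"F": 60, "J": 60, "G": 60, "H": 12}

# Divide every count by the greatest common divisor to get the simplest ratio
divisor = reduce(gcd, copies.values())
ratio = {protein: n // divisor for protein, n in copies.items()}

print(":".join(str(ratio[p]) for p in ("F", "J", "G", "H")))  # 5:5:5:1
```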


The Morphogenesis and Structures of the 
Virion and Procapsid 

During the last decade, the atomic structures of several
Microviridae virions and the φX174 procapsid have been
solved (35, 36, 92-94). This, combined with well-established
genetics and biochemistry, has made the φX174 system one
of the most powerful for studying morphogenesis. While
crystal structures provide a wealth of information, allowing
the results of genetic experiments to be interpreted within
a structural context, data are limited to the particle crystallized.
Transient or less stable interactions between proteins
are not always apparent (26, 35, 36, 72), but can
be elucidated by the results of genetic and biochemical 
studies. 


Pre-scaffolding Stages 

As illustrated in figure 11-2, the first φX174 morphogenetic
intermediates are the 9S and 6S particles, pentamers
of the viral coat and spike proteins, respectively. These particles
most likely self-assemble without the aid of scaffolding
or host cell proteins such as groEL and ES (51, 68, 124).
Chaperone-independence distinguishes the φX174 coat
protein from those of larger bacteriophages (34, 53, 59, 97).
The groE genes were first defined as host cell mutants that 
failed to support the growth of several double-stranded 
DNA bacteriophages (124). Although extensive searches 
have been conducted with φX174, mutations in molecular
chaperones have never been recovered. However, mutations
in other genes, such as the rep helicase, are abundant (128;
M. Hayashi, personal communication).

The Internal Scaffolding Protein 

In cells infected with nonsense or temperature-sensitive 
alleles of the internal scaffolding protein, protein B, 9S and 
6S particles accumulate (42, 115, 132), demonstrating that
pentamer formation is not a function of the first scaffolding- 
mediated steps of morphogenesis. Furthermore, the atomic 
structures of the Microviridae capsids reveal extensive 5-fold 
related contacts, suggesting a self-assembly mechanism 
(35, 36, 92-94). Pentamers formed in the absence of functional
scaffolding proteins do not differ biologically from
pentamers formed in their presence. Upon shifts to permissive
temperatures in tsB- and csB-infected cells, 9S and 6S
particles are efficiently chased into virions (42, 115).

After 9S pentamer formation, the internal scaffolding
protein binds to the pentamer’s underside and induces a
conformational change in the particle. This change inhibits
premature aggregation (115) and produces an assembly-competent
state. B protein binding is both necessary and
sufficient to allow interactions with the external scaffolding
and major spike proteins (42, 132). Elucidating the structural
changes of the conformational switch has proved difficult. 
Ideally one would compare the atomic structures of a naive 
pentamer with a pentamer in the procapsid. However, 
assembly-naive 9S particles aggregate in vitro, complicating 
crystal formation (R. McKenna, personal communication). 
Nevertheless, the results of second-site genetic analyses of 
a cold-sensitive B allele have offered some insights into the 
nature of the conformational switch (42, 45) and indicate 
that the morphogenetic changes occur on the outer surface 
of the coat protein. 

Although morphogenesis does not continue past the first 
B-protein-mediated reaction in cells infected with this cold- 
sensitive B mutant, two lines of evidence suggest that the 
mutant protein retains some level of function, indicating 
a defect in conformational switching. First, 9S particles 
do not aggregate in vivo, suggesting that the csB protein 
can still inhibit the aggregation of coat protein pentamers. 
Second, the mutant is rescued by substitutions located on 
the outer surface of the coat protein, not the scaffolding- 
coat protein interface. The mutations are located within 
three distinct sequences of considerable homology, all found 
in loop regions of the protein, as opposed to the β-barrel core.
These sequences may play a key role, perhaps as hinges, in 
mediating pentamer conformational switches. Coat protein 
mutations affecting other stages of morphogenesis
(external scaffolding protein interactions, packaging complex
recognition, B protein specificity, and the provirion to
virion transition) have also been isolated (28, 43, 46, 56,
81). All these mutations are found within the loop regions
of the atomic structure, suggesting that, once the protein has
folded, the contribution of the β-barrel core to morphogenesis is
minimal.

Genetic analyses of the internal scaffolding protein have
been impeded by a dearth of scaffolding protein missense
mutations that confer morphogenetic defects. Such a dearth
of mutations is rare in phage systems. Although mutations
in the gene B reading frame can also produce mutations
in gene A (proteins A* and K are nonessential) (30, 129),
the results of genetic analyses indicate that
genome organization is not responsible for this phenomenon.
The B gene was cloned and mutagenized in a plasmid,
thus removing any selective pressures that may be enforced 
by the overlapping reading frames. After mutagenesis, and 
reintroduction of the mutagenized plasmids into host cells, 
clones were screened for loss of the ability to complement 
amB phage. Although many noncomplementing B genes 
were recovered, all the mutations were nonsense or frameshift
mutations. Furthermore, the mutagenized clones were
screened for the ability to inhibit wild-type plaque formation,
an assay for dominant lethal gene products. None
were observed. The results of these experiments suggest
that the lack of morphogenetic missense mutations is the
result of a highly tolerant protein structure.

To test this hypothesis, the internal scaffolding proteins
of the related bacteriophages α3 and G4 were cloned,
expressed in vivo and assayed for the ability to cross-complement
φX174, G4, and α3 nullB mutants.
Despite the low sequence homology (30%) the proteins
were, with one exception (see below), capable of
cross-complementation, yielding not only procapsids but mature
virions (28). However, in all cases morphogenesis was
more efficient when directed by the indigenous protein. In 
essence, the “foreign” scaffolding proteins can be regarded 
as a “multiple mutant.” If a scaffolding protein in which 
70% of the amino acids are altered is functional, the 
difficulty in obtaining defective proteins with single amino 
acid substitutions becomes apparent. These results also 
indicate that scaffolding proteins are inherently flexible. 
Experiments conducted with the analogous bovine herpesvirus
1 and herpes simplex virus (HSV) proteins yielded
similar results (54), suggesting that flexibility may be a
general property of internal scaffolding proteins. Considering
the dynamics of viral assembly, some inherent flexibility
is probably required. Internal scaffolding proteins
must first assume a structure that directs the assembly of
coat proteins into a rigid capsid. Afterwards, these proteins
must assume an alternative structure, compact enough to
be extruded through 2-3 nm pores, as observed in the
φX174 and P22 systems (72, 102).

Evidence for inherent flexibility is also revealed in the
φX174 procapsid structure. In the procapsid, the internal
scaffolding protein binds to a cleft formed between α-helix 2
and the β-barrel of the coat protein (35, 36, 92). Coat protein
binding is mediated by the last 24 amino acids of the scaffolding
protein, which is the only portion of the proteins
used in the cross-complementation studies that exhibits a
high degree of sequence conservation. These interactions
are primarily aromatic and comprise the most intimate 
B-coat protein contacts. The C-terminus is also the most 
ordered part of the protein. The first 60 amino acids of
the protein yield primarily diffuse density, suggesting that
interactions are variable and/or nonspecific in nature. In 
addition, the N-termini of the cross-functional scaffolding 
proteins are highly divergent. Therefore, it is unlikely that 
coat/N-terminal scaffolding interactions are governed by 
specific side chains. Again, structural studies with P22 and 
herpesvirus procapsids have yielded similar results. In 
cryo-image reconstructions, the C-termini of the proteins 
appear to produce ordered density, where the protein is in 
close contact with the capsid protein (99, 131, 141).

Data from three diverse systems (φX174, P22, and
HSV-1) suggest that the C-termini of internal scaffolding
proteins play a critical role in coat protein recognition. The
importance of this region in the φX174 system is reinforced
by both genetic and biochemical data. As stated above,
the scaffolding proteins of the related phages G4, φX174,
and α3 are able to cross-complement (25, 28). However, there
was one instance in which cross-complementation was 
not observed. The φX174 protein cannot participate in the
formation of the G4 procapsid. Characterization of the G4
morphogenetic pathway under these conditions revealed
an accumulation of major coat and spike protein pentamers,
indicating that morphogenesis terminated before the first
internal scaffolding protein-mediated reaction. In addition,
the φX174 B protein does not inhibit wild-type G4 morphogenesis.
These data suggest that the φX174 B and G4 coat
proteins cannot interact. However, two G4 mutants
(φXB-utilizers) that productively utilize the φX174 B protein were
isolated. These mutations confer substitutions in the G4 coat
protein that contact the C-terminal half of the B protein. In
both instances, the substitutions create local coat protein
sequences that are more φX174-like. One of these substitutions
is located directly within the aromatic B protein-binding
cleft. It confers a Ser → Phe substitution and most
likely reflects the importance of aromatic interactions in
coat recognition. The importance of C-termini interactions
was further investigated by constructing chimeric B genes
(25). The φXG4 B protein complements G4 nullB mutants,
demonstrating that the inability of the φX174 B protein to
interact with the G4 coat protein is a function of the 
C-terminus. In addition, when the C-terminus of any 
chimeric scaffolding protein was of the same origin as the 
viral coat protein, complementation was the most efficient. 

For the most part, the function(s) of the N-termini 
remains obscure. The procapsid atomic structure suggests 
that the first 10 amino acids form an α-helix that self-associates
across the 2-fold axis of symmetry. However,
proteins lacking this α-helix function like the wild-type
(25). This deletion may be tolerated because the external
scaffolding protein also stabilizes the 2-fold axes of symmetry
(35, 36). Cryo-image reconstructions of the α3 procapsid
suggest that a large portion of the internal scaffolding 
protein, approximately residues 15-60, may be located at 
the 5-fold axes of symmetry, perhaps associated with the 
DNA pilot protein (14). A series of larger N-terminal deletion 
proteins have been constructed (100). The elimination of the 
first 38 amino acids supports large particle formation but 
terminates morphogenesis before virion production. The 
defective particles contain a full complement of coat and 
spike proteins but a drastically reduced amount of the 
H DNA pilot protein. These results support the hypothesis 
that a portion of the B protein may be located at the 5-fold 
axes of symmetry and indicate that the internal scaffolding
protein may play a role in H protein incorporation.
Oddly, B proteins lacking the first 53 amino acids support
virion production at temperatures above 33°C. Below 33°C,
the mutant protein cannot support procapsid formation.
Mutants that can utilize this deletion protein at lower
temperatures have been isolated. The mutations map to 
dimerization domains in the external scaffolding protein. 
Perhaps a stronger external lattice can compensate for
internal scaffolding proteins which cannot fully induce
the conformational changes in coat pentamers needed to 
form capsid curvature. 

The exact identity of the assembly intermediate between
the 9S and 6S particles and the procapsid remains somewhat
obscure. Unlike that of most viruses, φX174 morphogenesis
also relies on an external scaffolding protein (protein D)
in order to assemble a pentameric intermediate into the
procapsid. However, prior interaction with the internal scaffolding
protein is required for subsequent coat protein interactions
with the external scaffolding and spike proteins
(115). In the absence of a functional D protein, 12S particles
accumulate (132). These particles appear to have 5-fold rotational
symmetry (62). Three chemically distinct 12S particles
have been isolated (42, 45, 62); all three contain the F and G
proteins. They differ in the incorporation of the DNA pilot 
and internal scaffolding proteins. The presence or absence 
of the B protein, a substrate for the ompT protease, is an 
artifact of purification procedures (105). However, ompT 
cleavage is not required for morphogenesis (32). Although 
the incorporation of the H protein remains obscure, its
absence does not affect the formation of capsid-like structures
in vivo (123).

The 12S particle exhibits the biochemical properties
traditionally associated with morphogenetic intermediates:
the ability to be chased into large particles in temperature-shift
experiments with tsD mutants. However, it is not clear
whether this particle represents a true morphogenetic intermediate
or the product of an off-pathway but reversible reaction
(62). The procapsid atomic structure supports the latter
possibility. However, the particle matured during crystallization
and most likely does not represent the biologically
significant intermediate. There are apparently few contacts 
made between the F and G proteins within the structure. 
The spike pentamers are tethered to the underlying capsid 
proteins via the external scaffolding protein, which is not 
a component of the 12S particle. Yet the 12S particle formed
in tsD infections is most likely not the degradation product 
of a fully formed procapsid. Such degradation products lose 
the ability to re-enter the morphogenetic pathway. For example,
in cells infected with csD alleles at restrictive temperatures,
fragile procapsids are produced. During DNA
packaging, these procapsids dissociate into 12S-like particles
that cannot be chased into larger structures (42). Then what
is the true assembly intermediate? The atomic structure
suggests that it should contain not only the internal scaffolding,
major spike, and capsid proteins but also 20 copies
of the external scaffolding protein. This may be the fleeting
18S particle (28, 45).

The External Scaffolding Protein 

The φX174 external scaffolding protein (protein D) performs
many of the functions typically associated with
internal species in systems with one scaffolding protein: the

Figure 11-3 The structures of the external scaffolding
protein.

organization of assembly precursors into a procapsid and 
the stabilization of that structure. However, its function is 
physically and temporally dependent on the internal scaffolding
protein, which induces the conformational changes
in capsid pentamers. In the procapsid crystal structure, 20 D 
proteins are associated with each pentameric capsomer 
(35, 36). Remarkably, there is little or no contact between 
capsid pentamers. The structure is held together primarily 
by 2-fold related contacts between D proteins. The four 
D subunits (D1, D2, D3, and D4) per asymmetric unit
(figure 11-3) are arranged as two similar, but not identical, 
asymmetric dimers (D1D2 and D3D4). These dimeric sub- 
assemblies may be a component of the tetrameric structures 
seen in solution (R. McKenna, personal communication). 
However, these particles have closed point group symmetry, 
a significant departure from the appearance of the D sub¬ 
units in the asymmetric unit. It has not been determined 
whether these tetrads consist of four or eight proteins. 
These particles sediment at 4S and behave like assembly 
intermediates in pulse-chase experiments (132). 

The “canonical monomer” in the crystal structure is
composed of seven α-helices separated by loop regions.
However, there is considerable structural variation between
the subunits, which bear no resemblance, not even an
inkling, to quasi-equivalence. For example, α-helix #5
forms a β structure in subunit D2, where it participates in
interdimer contacts with D3, and a helical structure in
D3, where it participates in D4 intradimer contacts. In subunit
D4, it mediates scaffolding contacts across the 2-fold
axis of symmetry and contains no secondary structure.
α-helix #7 forms only in the D4 subunit, where it mediates
the most extensive coat protein interactions found in the
entire lattice. Structural data also suggest that this unique
arrangement is mediated, in part, by glycine residue 61 in
α-helix #3. One monomer in each dimer is bent 30° at this
site. The kink may be needed to switch the second monomer
into a non-sticky conformation. The importance of this residue
is even reflected in the organization of the genome:
the glycine 61 and gene E start codons overlap in all the
φX174-like phages (52, 83, 107). Hence there is strong
selective pressure to maintain the codon. A limited number
of site-directed mutants can be and have been generated at
this site (27). The mutants can only be propagated in cells
overexpressing the wild-type protein. In coinfections with
wild-type φX174, the lethal phenotype is dominant, suggesting
morphogenetic defects occurring after monomers
interact to form dimers.

While the atomic structure of the φX174 procapsid has
yielded a wealth of information, it does not reveal all the
contacts in which the external scaffolding protein participates.
During crystallization, the procapsid matured: the
coat protein assumed a conformation similar to its structure
in the mature virion. There are two dramatic differences
between the X-ray and the cryo-electron microscopy models.
First, in the X-ray model, the coat protein has moved inward
radially, away from the external scaffolding protein lattice.
However, the coat protein has not fully dissociated from it.
Second, a large coat protein α-helix occupies the 3-fold axes
of symmetry, which in the cryo-electron microscopic image
are free of density and thus each contain a 3 nm pore. In addition,
the interactions between the procapsid and the packaging
machinery would not be visualized. The results of genetic
experiments and experiments conducted with inhibitory
cross-species and chimeric scaffolding proteins have elucidated
some of these unseen contacts. The primary sequences of the
external scaffolding proteins of φX174 and the related
bacteriophage α3 are 70% conserved (83, 107). Divergent
sequences are localized to the N- and C-termini of the
proteins. These sequences constitute α-helices #1 and #7 and
loop #6 in the atomic structure. The φX174 and α3 external
scaffolding genes have been cloned and expressed in vivo
and do not cross-complement. However, expression of a
foreign scaffolding protein blocks wild-type morphogenesis
(26). The ability of foreign scaffolding proteins to inhibit
morphogenesis is most likely due to the formation of
cross-species dimers. The regions of the proteins responsible for
dimerization (α-helices #2-#6) are strongly conserved.

To determine whether one or both termini confer inhibitory
effects, chimeric genes, in which the first α-helices were
interchanged, have been constructed and expressed in vivo.
Expression inhibits morphogenesis in a somewhat
species-specific manner. Efficient inhibition is governed by the
identity of the first α-helix. The chimera that contains α-helix #1
from φX174, for example, strongly inhibits φX174 morphogenesis;
α3 morphogenesis is only modestly affected. The
relative levels and species specificity of inhibition merit
further explanation. While the phenomenon was symmetrical
(depending on which phage, φX174 or α3, was
used in the assay), for clarity the discussion will focus
exclusively on φX174. The weak inhibition conferred by
the α3/φX chimera is only achieved when the chimeric
gene is maximally induced. Under these conditions, inhibition
in plating assays ranged from 10⁻¹ to 10⁻³. The strong
inhibitory phenomenon is observed when the cloned φX/α3
D gene is barely induced. Even under those conditions, plating
efficiencies drop below 10⁻⁶. Although similarly low plating
efficiencies are also achieved when expressing the
wild-type foreign α3 protein, the cloned gene must be maximally
induced. These data suggest a temporal mechanism in
which the initial recognition of the coat protein is mediated
by α-helix #1 of the external scaffolding protein. The
presence of the proper first α-helix, of the same origin as the
viral coat protein, facilitates the incorporation of the
chimeric protein into the external lattice, acting as a vehicle for
the incorporation of inhibitory foreign loop #6 and α-helix
#7 sequences.

φX174 intermediates synthesized in cells expressing foreign
or the chimeric φX/α3 D proteins were analyzed by
sucrose gradient sedimentation. In extracts generated from
cells expressing the φX/α3 D chimera, which assays for
defects conferred by foreign loop #6 and α-helix #7 structures,
procapsids and empty capsids were present; however,
virions were not, indicating a block in DNA packaging. The
docking of the replication/packaging machinery is most
likely prevented by sequences found in the C-terminus of
the α3 protein. In addition, φX174 mutants resistant to the
expression of the φX/α3 D chimera have been isolated (chiDᴿ
mutants). These mutations alter viral protein A, a component
of the genome biosynthesis/packaging machinery,
which binds the procapsid during DNA packaging, presumably
at the 2-fold axis of symmetry (43). In extracts prepared
from cells expressing the foreign protein, procapsids (108S)
and empty capsids (70S) were not detected, suggesting
either a block before procapsid formation or the production
of unstable particles. This suggests that all the external
scaffolding dimers were heterogeneous and some of them,
keeping in mind that D proteins form asymmetric dimers,
may not have been able to recognize the coat and/or spike
proteins due to the presence of the foreign α-helix sequence
(26). Interactions between α-helix #1 of the D1 subunit
and the spike protein are visible in the X-ray model. The
helix’s proximity to the 3-fold axis of symmetry in the D4
subunit also suggests an interaction with the coat protein.
However, this interaction cannot be observed in the atomic
structure due to the above-mentioned maturation events
that occurred during crystallization.

In complementation experiments, neither the foreign
loop #6 nor helix #7 chimeras, built directly into the viral
genome, complemented the α-helix #1 chimera. This may
indicate that α-helix #1 also plays a critical role in the
D4 subunit. However, an element of uncertainty exists in
intragenic complementation experiments, due to the lack of
a positive control. Genetic data (42, 46) also suggest that an
unseen interaction occurs between D4 α-helix #1 and
α-helix #4 of the viral coat protein. Both helices are found
at the 3-fold axis of symmetry in the closed structure, which
has matured during crystallization. Two point mutations in
the first α-helix of protein D have been extensively characterized
(42, 46). These mutations confer a fragile procapsid
phenotype. While procapsid morphogenesis is not inhibited,
the particles dissociate during DNA packaging. In
an open structure, the coat protein helix could be shifted
upwards and may contact α-helix #1 of the D4 subunit of
an adjacent asymmetric unit, which is the subunit most closely
associated with the underlying coat protein. Both
helices are amphipathic and could interact via hydrophobic
interfaces. If this interaction is indeed present in the native
structure, it may exclude dimers with a foreign α-helix #1
from the D3D4 position. In addition, results of experiments
conducted with chimeric G4/φX174 external scaffolding
proteins and mutant φX174 coat proteins capable of productively
utilizing the chimeric scaffolding also suggest interactions
between the D protein's first α-helix and 3-fold related
coat protein residues (B. A. Fane, unpublished data).

To further dissect structure-function relationships in
the φX174 external scaffolding protein, additional chimeras
have been generated in the C-terminus of the protein. The
chimeric genes were built directly into the phage genome
and substitute either α3 loop #6 or helix #7 sequences
into the φX174 gene. The helix #7 chimera can be propagated
in cells overexpressing the wild-type protein.
However, in coinfections with wild-type φX174, the chimera
confers a dominant lethal phenotype. The severity of the
dominant phenotype is greatly decreased in coinfections
with the chiDᴿ strain, indicating that the chiDᴿ mutation
confers resistance to the defects caused by the incorporation
of this inhibitory region. The loop #6 chimera, on the other
hand, is viable, but has a cold-sensitive (cs) phenotype. Both
extragenic and intragenic revertants of the cs phenotype
have been isolated (27). The intragenic mutation is in loop
#6, conferring an E → D substitution in the central residue
of the loop, restoring the amino acid found in the φX174
sequence. From the atomic structure of the procapsid, it is
known that this residue, but only in the D4 subunit, makes
extensive contacts with lysine 118 of the coat protein.
However, φX174 morphogenesis is a dynamic system, and it
is difficult to conceive of successful evolution if morphogenesis
is contingent on the identity of one amino acid. Hence the
extragenic suppressors are likely of equal import. These
mutations confer amino acid substitutions on the outer
surface of the capsid protein, but they are not limited to the
contacts made by the D4 subunit. This suggests that the D4
position can be adjusted by neighboring D proteins within
the lattice. The identification of these suppressors serves
another important role: elucidating interactions lost in the
X-ray model due to particle maturation within the crystal.
This maturation includes the radial collapse of capsid proteins
away from the external scaffolding lattice (14, 36).
While the X-ray structure reveals coat protein contacts for
each D subunit, they are not extensive. The extragenic
suppressors map to residues adjacent to the known contacts,
indicating that the coat-external scaffolding interface in the
native structure is more extensive than the X-ray model
reveals.

The relative ability of the chimeric proteins to inhibit
morphogenesis suggests a temporal model for coat-external
scaffolding protein recognition. The cloned α3/φX chimeric
protein only weakly inhibits φX174 morphogenesis. In contrast,
the φX/α3 chimera is a potent inhibitor. These data
suggest that chimeric protein incorporation into the lattice
is a function of the first α-helix. Its proximity to the 3-fold
axis of symmetry and genetically defined interactions
with the viral coat protein (42, 46) suggest that the first
substrate-specific interaction occurs between a dimer of
scaffolding protein and the adjacent 5-fold related coat
protein. Figure 11-4 illustrates the atomic structure of the
3-fold axis of symmetry in the φX174 procapsid. In the
native structure, this axis would not be occupied by the
large coat protein helix. This helix may be restrained by the
D4 subunit of the adjacent asymmetric unit. After
coat-scaffolding recognition, loop #6 and α-helix #7 place the
D4 subunit dimer atop the capsid. D1D2 dimers would then
be added to the same asymmetric unit and the adjacent unit,
mediated by 5-fold D4-D2 and D4-D1 interactions, respectively.
In a chain reaction, dimers would add around the
pentamer. The resulting intermediate would contain five
copies of the spike, coat, and internal scaffolding proteins,
one copy of the minor spike protein H, and 20 copies of
protein D. A particle of this composition has been detected
in vivo and it sediments at 18S (28, 45). However, a means
to genetically trap this intermediate in large quantities has
not been established.

Figure 11-4 The 3-fold axes of symmetry of the φX174 “closed”
procapsid. In the native structure, the large coat helix does
not occupy the 3-fold axes of symmetry, which contain
pores. The results of genetic analyses suggest interactions
between the first α-helix of the D4 subunit and the large coat
protein helix.
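
The composition proposed for the 18S intermediate can be checked against the rest of the structure with simple bookkeeping. The sketch below is only an illustration: it assumes that a procapsid is built from 12 such pentameric intermediates (the number of 5-fold vertices in an icosahedral shell) with no subunits gained or lost, and the dictionary and function names are hypothetical.

```python
# Copy numbers per pentameric intermediate, as listed in the text.
PER_PENTAMER = {
    "coat (F)": 5,
    "major spike (G)": 5,
    "internal scaffolding (B)": 5,
    "minor spike (H)": 1,
    "external scaffolding (D)": 20,
}

# Assumption: an icosahedral T = 1 shell has 12 five-fold vertices,
# so 12 pentameric intermediates make one procapsid.
PENTAMERS_PER_SHELL = 12
ASYMMETRIC_UNITS_PER_SHELL = 60


def procapsid_copy_numbers(per_pentamer=PER_PENTAMER,
                           n_pentamers=PENTAMERS_PER_SHELL):
    """Scale the per-pentamer composition up to a whole procapsid."""
    return {name: copies * n_pentamers for name, copies in per_pentamer.items()}


totals = procapsid_copy_numbers()
for name, copies in totals.items():
    print(f"{name}: {copies}")

# D proteins per asymmetric unit implied by the totals.
d_per_asymmetric_unit = totals["external scaffolding (D)"] / ASYMMETRIC_UNITS_PER_SHELL
print("D per asymmetric unit:", d_per_asymmetric_unit)
```

Under these assumptions the totals come to 60 coat proteins and 240 D proteins, that is, four D subunits per asymmetric unit, which agrees with the D1 to D4 arrangement described for the crystal structure.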

In a kinetic model of capsid assembly, as described for P22 
(103), procapsid formation would be nucleated by a rate- 
limiting step, and hence a higher order reaction than those 
which follow. If post-nucleation morphogenesis involves 
the addition of one pentameric intermediate to a growing 
shell, formation of the nucleation complex would require at
least three pentameric intermediates. Since there are no 
coat-coat or 3-fold related scaffolding contacts in the 
procapsid atomic structure, the reaction is expected to be 
catalyzed by three sets of 2-fold related interactions. The 
involvement of the external scaffolding protein is easily 
visualized, due to the specific and ordered 2-fold related 
contacts in the crystal structure. The role, if any, of 
the internal scaffolding protein in procapsid nucleation 
remains more obscure. 
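
Why a higher-order nucleation step would be rate limiting can be illustrated with a toy mass-action calculation. This is a sketch under loud assumptions: it treats the reaction order as equal to the number of pentameric intermediates involved (three for nucleation, one for each subsequent addition), ignores rate constants entirely, and uses hypothetical names throughout. It is not a model fitted to φX174 or P22 data.

```python
NUCLEATION_ORDER = 3   # assumption: three pentameric intermediates per nucleus
ELONGATION_ORDER = 1   # assumption: one pentameric intermediate per addition step


def relative_rate(concentration_fold, order):
    """Fold change in a mass-action rate when the intermediate
    concentration changes by `concentration_fold`."""
    return concentration_fold ** order


for fold in (1.0, 0.5, 0.25):
    nucleation = relative_rate(fold, NUCLEATION_ORDER)
    elongation = relative_rate(fold, ELONGATION_ORDER)
    print(f"concentration x{fold}: nucleation x{nucleation}, elongation x{elongation}")
```

In this simplified picture, halving the concentration of pentameric intermediates slows each addition step 2-fold but slows nucleation 8-fold, so nucleation becomes the bottleneck as intermediates become scarce.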

DNA Packaging and the DNA
Binding Protein

Genome biosynthesis and packaging are concurrent pro¬ 
cesses. The pre-initiation complex, consisting of the host 
cell rep, viral A and C proteins, associates with the procapsid 
forming the 50S complex (95). The viral A protein binds 
the origin of replication in replicative-form DNA, as des¬ 
cribed above. This is both necessary and sufficient for pack¬ 
aging specificity (3, 49). The location of the pre-initiation 
docking site has been elucidated in a series of genetic 
experiments (43). φX174 morphogenesis was examined in
two hosts with mutations in the bacterial rep gene (128). 
Morphogenesis was blocked at the formation of the 50S com¬ 
plex. Stage II DNA synthesis, which also requires the bacte¬ 
rial rep protein, was not affected. Procapsids accumulate 
in these infected cells. A second-site genetic analysis was 
conducted. Second-site phage-encoded mutations were iso¬ 
lated in the genes encoding the viral coat and A proteins. 
The mutations within the coat protein cluster in the atomic 
structure of the virion, tracing a pronounced depression 
that skirts the 2-fold axis of symmetry. The location of 
these substitutions suggests that the depression may serve 
as the pre-initiation complex binding site during 50S parti¬ 
cle formation. Both genetic and structural data support 
this hypothesis. The second site suppressors are active in 
trans and confer dominant phenotypes. 

The DNA binding protein, protein J, enters the procapsid 
during packaging (50) and is absolutely required for genome 
encapsidation (57, 58). Furthermore it may dislodge the 
internal scaffolding protein from a common binding cleft in 
the coat protein (35,36,92-94). Protein J is a small, extremely 
basic protein, 37 amino acids in length and separated into 
four functional domains. It binds the genome via simple 
charge-charge interactions (81) mediated by two DNA bind¬ 
ing domains. These domains contain the 12 basic residues 
of the protein and are separated by a short sequence rich
in proline residues. Once in the procapsid, the C-terminus 
of the protein, which is very hydrophobic and aromatic, 
binds to a coat protein cleft (93, 94). This may facilitate 
further interactions between the genome and capsid, speci¬ 
fically with a small cluster of adjacent basic capsid amino 
acids (93, 94). 

Genetic studies with the φX174 J protein have demonstrated
that the ability to bind and package DNA is directly
related to the number of basic amino acid residues in the
protein (56, 81). Step-wise substitutions for the basic residues
produce less functional proteins. The removal of fewer than
three basic residues produces viable particles. However, the
particles often have cs and ts phenotypes and very small
plaque morphologies. Substitutions of three or four basic
residues produce fully packaged particles that have lost
infectivity. These mutants have dominant lethal phenotypes
in coinfections with wild-type φX174. The dominant
phenotype and the production of packaged particles
indicate that the mutant proteins retain enough function
to enter into the morphogenetic pathway. Substitutions of
six basic amino acids confer recessive lethal phenotypes.
However, the proteins retain a low level of function. Stage
III DNA synthesis is completed and unit-length genomes
are associated with capsids, but are not fully encapsidated,
remaining sensitive to DNase (56).

Genome-Capsid Interactions and the 
Final Collapse 

Unlike large dsDNA bacteriophages (38), the φX174 genome
does not exist as a dense core in the capsid. Instead, the
DNA binding protein and basic capsid amino acid residues
tether it to the capsid's inner surface. There are 60 copies
of protein J per virion, one associated with each coat protein. 
In the atomic model, the protein forms an S-shaped poly¬ 
peptide chain devoid of secondary structure. The C-terminus 
of the protein is tightly associated with a cleft, located 
near the center of the coat protein. Moving toward the 
N-terminus, the protein traces a path toward the 5-fold axis 
of symmetry, crosses over to the adjacent capsid protein, and 
veers toward the C-terminus of the adjacent J protein. This 
motif suggests that the DNA binding protein guides the 
incoming genome into a somewhat ordered conforma¬ 
tion. Accordingly, a portion of the genome is ordered in the 
X-ray structure (93, 94). 
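
The charge bookkeeping behind the substitution studies can be made explicit. The sketch below is illustrative only: it assumes each basic residue contributes one positive charge and each nucleotide one negative phosphate charge, and it takes a genome length of 5,386 nucleotides, a value that is not stated in this section. The function names are hypothetical.

```python
J_COPIES_PER_VIRION = 60    # from the text: one J protein per coat protein
BASIC_RESIDUES_PER_J = 12   # from the text
GENOME_LENGTH_NT = 5386     # assumption: genome length, not given in this section


def j_positive_charges(substituted_per_protein=0):
    """Total positive charges contributed by protein J in one particle
    after substituting the given number of basic residues per protein."""
    if not 0 <= substituted_per_protein <= BASIC_RESIDUES_PER_J:
        raise ValueError("substitutions must lie between 0 and 12")
    return J_COPIES_PER_VIRION * (BASIC_RESIDUES_PER_J - substituted_per_protein)


def fraction_of_phosphates_matched(substituted_per_protein=0,
                                   genome_length=GENOME_LENGTH_NT):
    """Fraction of genome phosphates that J-protein charges could pair with."""
    return j_positive_charges(substituted_per_protein) / genome_length


# Substitution counts discussed in the text: fewer than three, three or four, six.
for n in (0, 2, 3, 4, 6):
    print(n, j_positive_charges(n), round(fraction_of_phosphates_matched(n), 3))
```

With these assumptions the wild-type particle carries 60 × 12 = 720 J-protein charges, and each basic residue substituted removes 60 charges per particle, which gives a sense of why step-wise substitutions have graded effects.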

The biophysical characterization of fully packaged particles
with mutant DNA binding proteins suggests that the
φX174 genome, more specifically its interactions with the
DNA binding and capsid proteins, may also perform a morphogenetic
function, mediating the final stages of morphogenesis
(56). These final stages involve the dissociation of
the external scaffolding protein and a 0.85 nm radial
collapse of capsid pentamers (14, 35, 36, 72). The mutant
particles were significantly denser than wild-type, but the
protein composition of the two particles appears to be identical.
Therefore, gross excesses of protein J within the
volumetrically fixed capsid do not cause the altered density.
The effects of possible Cs⁺ permeability were more difficult
to discern. However, particle densities were restored to near
wild-type values when procapsids carrying an extragenic
second-site suppressor were packaged with the mutant DNA
binding protein.
Therefore, a model in which counter-ions compensate for
the loss of basic amino acids is not the basis of the density
differences. Differences are also expressed on the capsid
exterior. In host cell attachment assays, mutant particles
exhibited substantially lower attachment efficiencies.
Dramatic differences in native gel migration, which is a function
of size and net surface charge (112), were also observed.
In all assays, the extragenic suppressor restored the properties
of the mutant particles to nearly those of wild-type
virions. Naked φX174 DNA is substantially richer in secondary
structure than packaged DNA (13). Therefore, if an interplay
between base-pairing and DNA-capsid association
occurs, altering the base composition of the genome should
also produce altered particles. φX174 ampicillin transducing
particles were generated by packaging single-stranded
versions of unit-length plasmids. As with the particles packaged
with the mutant DNA binding proteins, the transducing
particles exhibited different biophysical characteristics
from wild-type φX174.

The role of DNA-capsid interactions in φX174 is obviously
not as dramatic as those seen in other viral systems
such as Flock House, Southern cowpea mosaic virus or 
Brome mosaic virus. In those systems abrogating genome- 
capsid interactions leads to either polymorphic particles or 
capsids with altered T values (37, 86, 108). In the procapsid 
there are no discernible pentamer-pentamer interactions. 
The integrity of the capsid appears to be maintained by 
the scaffolding proteins. After packaging, the internal scaf¬ 
folding protein is extruded from the structure and replaced 
by the DNA binding protein and the tethered genome. 
This may supplant scaffolding function in the provirion. 
The provirion-to-virion transition is marked by the release 
of the external scaffolding protein and the completion 
of the 0.85 nm radial collapse of coat protein pentamers. 
This tether constrains the spatial orientation and secon¬ 
dary structure of the remaining nucleotides (13). Therefore 
altering the tether or the base composition of the packaged 
nucleic acid may affect the magnitude or the integrity of 
the collapse. 

Lysis 

A more detailed discussion of bacteriophage lysis strategies,
including those of the Microviridae, can be found in
chapter 10. Therefore, the topic will only be summarized here.
Unlike large double-stranded DNA bacteriophages that encode
a two-component lysis system, comprising an endolysin 
and a holin, small bacteriophages, such as the Microviridae, 
do not have the genetic capacity to encode a two-component 
system (140). To circumvent this constraint, the Microviridae, 
as well as other small bacteriophages, have evolved a small 
protein that inhibits a host cell enzyme involved in peptido- 
glycan biosynthesis (15). In the Microviridae, the E protein 
performs this function (70). Once translated, the protein is 
found associated with the cell membrane (2). The mecha¬ 
nism for E protein action has been controversial for many 
years. However, the results of recent genetic experiments 
have elucidated the mechanism of this “antibiotic-like” 
protein. 

The expression of an inducible cloned E gene leads to cell
lysis (139). Therefore E protein is both necessary and sufficient
for this function. However, mutant cells resistant to
gene E expression, slyD (sensitivity to lysis), can be easily
isolated (90). The phenotype is conferred by mutations in
a peptidyl-prolyl cis-trans isomerase, or PPIase (106).
However, it is unlikely that the slyD gene product is the
E protein target. Considering the function of PPIases, protein
folding (33), and the observation that protein E does not
associate with the membranes in slyD cells, it is more likely
that E protein, with five prolyl bonds, is a PPIase substrate.
φX174 gene E mutants, Epos (plates on slyD), were also isolated.
Epos proteins were expressed in an E. coli slyD host to
identify other host cell genes that confer a lysis-resistant
phenotype (15). The surviving colonies contained mutations
in the mraY gene, which encodes translocase I. This enzyme
catalyzes the formation of the first lipid-linked intermediate
in cell wall biosynthesis (71). The results of both in vitro
and in vivo studies strongly suggest that E protein
specifically inhibits the translocase I-catalyzed reaction (16).
The lysis of φX174-infected cells requires cell growth (19).
The similarities between E-protein-mediated lysis and
penicillin-mediated lysis are apparent.

Evolution and Evolutionary Studies 

In the past decade, two approaches have been taken to
investigate Microviridae evolution. In one approach, pioneered
by Drs. J. J. Bull and H. A. Wichman and colleagues,
viruses are placed under selective conditions, either
high temperature or host variations, and grown for numerous
generations in a chemostat (23, 24, 31, 69, 137). At various
time intervals, individual genomes are sequenced. Therefore
the appearance and disappearance of beneficial mutations
can be monitored. As expected, host adaptation experiments
have identified mutations that increase attachment and
penetration efficiency. But mutations that affect intracellular
tropism, more specifically mutations in gene A, were
also uncovered. Protein A adaptation would be needed to
optimize its interaction with the host cell rep protein, which
functions as a host cell helicase during stage II and stage III
replication (41). The high temperature selection yielded
several types of mutations. In addition to mutations that
appear to affect both intracellular and extracellular interactions,
which may be needed to stabilize macromolecular
interactions at higher temperatures, many mutations affecting
morphogenesis and/or the stability of the procapsid were
isolated. Considering the metastable properties of assembly
intermediates, these results are not surprising.

In both selections, mutations in genetic regulatory sequences
were also recovered. These mutations most likely
optimize the relative level of viral proteins synthesized
under the experimental conditions. However, it should not
be assumed that these mutations lead to elevations in transcription,
transcript stability, or translation. Adaptation may
involve maintaining an optimal balance of viral components
(48, 125). For example, the expression of a relatively stable
protein under selection conditions may be downregulated
while the expression of less stable proteins may be upregulated.
Some recurrently recovered neutral mutations may
also be acting on this level, changing the intracellular level
of the encoded protein by codon usage. However, neutral
mutations may also be acting on a structural level by altering
the genome's secondary structure. As discussed above, the
interplay between the genome's secondary structure and
its interactions with the capsid's inner surface may affect
the final stages of virion morphogenesis and can create
capsid surface distortions that reduce host cell attachment.
In several of the selections, a deletion in the F-J intercistronic
region was recovered. This deletion was also isolated as a
mutation that elevates the rate of host cell attachment
and penetration (74). In addition, it acts as a second-site
suppressor of mutations that affect the interface between
the genome and the inner surface of the viral capsid
(S. Hafenstein and B. A. Fane, unpublished data).

The second area of evolutionary research has focused
on the isolation of novel members of the family. As stated
by Hendrix et al. (66, 67), the prevalence of double-stranded
DNA phages and prophages (cryptic, defective, and
replication competent) creates an enormous pool of evolutionary
material that can be horizontally exchanged, a concept
known as the moron accretion hypothesis. Consequently,
a mosaic spectrum of related phage species has arisen.
In contrast, the members of the Microviridae appear to fall
into two distinct and rather distantly related subfamilies.
Protein homologies between the two subfamilies are approximately
20% or less (table 11-2), a typical value when
comparing the most distantly related members of either
the lambda or T4-like groups (66, 67, 130). However, unlike
tailed dsDNA families, no mosaic species that bridge the
evolutionary chasms have been isolated. The members of
the φX174 subfamily were isolated from γ-proteobacteria,
and are fairly closely related. In contrast, the members of the
second subfamily were isolated from a very diverse group of
hosts: Chlamydia, Bdellovibrio (δ-proteobacteria), and
Spiroplasma (20, 88, 104, 126). Despite the extreme diversity of
these hosts, the viruses within the subfamily are remarkably
similar. In fact, the Bdellovibrio virus φMH2K is more closely
related to some of the Chlamydia viruses than the Chlamydia
viruses are related to each other (table 11-2), suggesting that
species jumping may play a major role in the evolution of this
family.

Table 11-2 Amino Acid Identities of φMH2K Gene Products with Chp1, Chp2, SpV4, and φX174, and Amino Acid Homologies

Percent amino acid identity

Gene productᵃ   Chp2   Chp1   SpV4   Chp2/Chp1ᵇ   φX174-likeᶜ
VP1             46.9   40.4   38     49.6         19 (α3 F)
VP2             26.5   21.3   25     29.9         20 (α3 H)
VP3             32     27.6   18.4   27.3         18 (α3 B)
VP4             27.9   22.5   27     22.2         18 (G4 A)
VP5             39.5   26.8   18.4   30.2         20 (α3 C)
VP8             32.6   33.3   31     54.5         21 (G4 J)

ᵃ Genes (and gene products) which are conserved between φMH2K and the chlamydiaphages.
ᵇ Comparison between Chp1 and Chp2 proteins.
ᶜ Comparisons between φX174, G4, or α3.
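
The comparison drawn from table 11-2 can be made explicit by listing the gene products for which φMH2K is more similar to a chlamydiaphage than the two chlamydiaphages are to each other. The sketch below simply transcribes the percent identities from the table; the dictionary layout, key names, and function name are hypothetical, and a higher percent identity is taken as a rough proxy for closer relatedness.

```python
# Percent amino acid identities transcribed from table 11-2.
# "Chp2" and "Chp1" are comparisons with the phiMH2K protein;
# "Chp2_vs_Chp1" compares the two chlamydiaphage proteins with each other.
IDENTITY = {
    "VP1": {"Chp2": 46.9, "Chp1": 40.4, "SpV4": 38.0, "Chp2_vs_Chp1": 49.6},
    "VP2": {"Chp2": 26.5, "Chp1": 21.3, "SpV4": 25.0, "Chp2_vs_Chp1": 29.9},
    "VP3": {"Chp2": 32.0, "Chp1": 27.6, "SpV4": 18.4, "Chp2_vs_Chp1": 27.3},
    "VP4": {"Chp2": 27.9, "Chp1": 22.5, "SpV4": 27.0, "Chp2_vs_Chp1": 22.2},
    "VP5": {"Chp2": 39.5, "Chp1": 26.8, "SpV4": 18.4, "Chp2_vs_Chp1": 30.2},
    "VP8": {"Chp2": 32.6, "Chp1": 33.3, "SpV4": 31.0, "Chp2_vs_Chp1": 54.5},
}


def closer_than_chlamydiaphages(table, partner):
    """Gene products whose phiMH2K-vs-partner identity exceeds the
    Chp2-vs-Chp1 identity for the same protein."""
    return [gene for gene, row in table.items()
            if row[partner] > row["Chp2_vs_Chp1"]]


closer_to_chp2 = closer_than_chlamydiaphages(IDENTITY, "Chp2")
closer_to_chp1 = closer_than_chlamydiaphages(IDENTITY, "Chp1")
print("Closer to Chp2 than Chp1 is:", closer_to_chp2)
print("Closer to Chp1 than Chp2 is:", closer_to_chp1)
```

By this measure the statement holds for a subset of the gene products (VP3, VP4, and VP5 against Chp2) rather than for every protein, which is consistent with the wording "some of the Chlamydia viruses" in the text.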

There are several factors that may limit horizontal transfer
in ssDNA viruses, such as small circular genomes and
lytic life cycles that do not require recombination. In addition,
the small T = 1 capsid may restrict the incorporation
of exogenous DNA sequences, or morons (66, 67). Since it is
unlikely that these small viruses can acquire morons, all
members of the Microviridae appear to possess preserved
open reading frames, found mostly in overlapping genes.
Mutations could accrete in these reading frames, or cretins,
until a gene encoding a beneficial function is produced.
Examples of φX174 cretins may include lysis gene E, and
genes A* and K, both unessential and of unclear function
(30, 129).

Genetic maps of φMH2K, φX174, and Chp2 are given
in figure 11-5. Neither Chp2 nor φMH2K encodes an
external scaffolding or major spike protein, the φX174 D and
G proteins, respectively. The loss of these genes accounts
for the smaller φMH2K and Chp2 genomes. The external
scaffolding protein has at least two known functions in
φX174 morphogenesis. It stabilizes the procapsids at the
2- and 3-fold axes of symmetry and directs the placement of
the major spike protein (35, 36). These functions are either
not required or performed by different proteins in the
φMH2K-like phages. First, there is no major spike protein.
Second, the 2-fold stabilization function may be performed by Vp3,
the internal scaffolding protein equivalent, which in φX174
self-associates across 2-fold axes of symmetry. Finally, as
seen in the cryo-image reconstruction of SpV4, a large coat
protein insertion loop forms spikes at the 3-fold axis of
symmetry (29). This large insertion loop may be the relic
of the ancestral external scaffolding or the major spike
protein. Coding Vp3 in a normal reading frame, as opposed
to a cretin, may be a related phenomenon. The B proteins
of the φX174-like phages are highly diverged, yet they
cross-function, suggesting that interactions are primarily
nonspecific and flexible (25, 28). With the loss of the external
scaffolding protein, internal scaffolding protein interactions
may need to be more specific, requiring a reading frame
unconstrained by other genes.

Figure 11-5 The genetic maps of the φMH2K, Chp2, and φX174 phages. Reading frames in Chp2 and φMH2K that encode
homologous proteins have the same gene numbers. Chp2 open reading frames (ORFs) 6 and 7, and φMH2K ORFs W, X, Y, Z,
and N are cretins.

Future Prospects 

Considering the history of Microviridae research, it is difficult
to predict its future. For example, by the late 1980s, the
Microviridae, so popular in the 1960s and 1970s,
were being studied by only a few research groups. Then
came the atomic structures of the virion and procapsid, the
discovery of the φMH2K subfamily, nanotechnology, and
the ability to make transgenic animals. Animals transgenic
for the φX174 genome are being developed for eukaryotic
mutagenesis experiments (133). With the elucidation of
X-ray models, genetic maps of biochemical phenotypes
transubstantiate from straight lines to functional domains,
and the domains have structures. The consequences for
morphogenetic and evolutionary research are apparent.
The φMH2K-Chp2-like viruses could lead to the first DNA
transfer system for Chlamydia, opening areas of molecular
research that have been hindered by the lack of a laboratory-based
genetic system. The atomic structure of the φ29
connector represents a powerful nano-motor that must
drive the packaging of DNA to the concentration of liquid
crystal (116). Considering protein-DNA interactions found
in the Microviridae, is the pentameric spike protein a component
of the motor that is required to force the DNA out of
the capsid? O brave new world that hath such particles in't!
Filamentous Phage

MARJORIE RUSSEL 
PETER MODEL 


Filamentous phages constitute a large family of bacterial
viruses that infect many Gram-negative bacteria and
even a Gram-positive bacterium (18). These long, slender
viruses contain a circular, single-stranded DNA genome
encased in a somewhat flexible tube composed of thousands
of copies of a single major coat protein (figure 12-1). Two
minor proteins at one end and two others at the other end
seal the tips of the tube. The genome consists of a dozen or
fewer closely packed genes and an intergenic (IG) region
that contains sequences necessary for DNA replication and
encapsidation. Unlike most bacterial viruses, filamentous
phage are produced and secreted from infected bacteria
without cell killing or lysis. Rather, they assemble at and are
secreted across the cell membrane(s). Readers are referred
to several other reviews on filamentous phage (4, 82, 101,
106, 112) for more comprehensive information and citations
of the primary literature than are given here.

Most information about filamentous phages derives from
those that infect Escherichia coli (f1/M13/fd, and to a lesser
extent IKe and I2-2). Structural analysis of several others
has shown that the packing density (protein:DNA mass
Figure 12-1 Filamentous phage f1/M13/fd: genes and gene
products. pII binds to a sequence (the (+) strand origin) in the
intergenic region (IG) of double-stranded DNA and nicks
the (+) strand; the original (+) strand is displaced by Rep
helicase as a new (+) strand is elongated from the 3' end of
the nick by host DNA polymerase III, using the (−) strand as
template. pX, which is identical to the C-terminal third of pII,
is required for the accumulation of single-stranded DNA,
as is pV. Dimers of pV bind cooperatively to single-stranded
DNA, which collapses the circular genome into a flexible rod
with the packaging signal (PS) exposed at one end of the
filament. pVII and pIX are small coat proteins located at the
tip of the virus that is first to emerge from the cell during
assembly. pVIII is the major coat protein, several thousand
copies of which form the cylinder that encases the single-stranded
DNA phage genome. pIII and pVI are located at the
end of the virion where they mediate termination of
assembly and release of the virion from the cell membrane.
pIII is also necessary for phage infectivity. pI may hydrolyze
ATP to promote assembly; pXI is identical to the C-terminal
third of pI; it lacks the cytoplasmic domain and may play a
structural role as part of an oligomeric pI/pXI complex. pIV
is a multimeric outer membrane channel through which
the phage exits the bacterium.


screw axis, while class II phage like Pseudomonas phages
Pf1 and Pf3 have a simple one-start helix (107). The mechanism
of assembly is likely to be fundamentally the same for
both classes; perhaps differences in the primary sequences of
the major coat proteins account for the different subunit
packing. DNA sequences reveal the modular nature of
phage evolution and other variations. Phages I2-2 and IKe
are highly homologous over the two thirds of their genomes
that encode capsid and morphogenetic proteins, whereas
the remaining portions, containing the replication origins
and replication genes, are unrelated (153). Although Pf3
has a gene IV, it is in a different position in the genome than
in other filamentous phages. Vibrio cholerae CTXφ lacks gene
IV (162, 163), which encodes an outer membrane assembly
protein that is essential in the E. coli phages; instead, it uses
the homologous host protein, EpsD (26). EpsD, one of three
Vibrio pIV homologs (members of the widespread “secretin”
family), is also required for secretion of cholera toxin
(133, 143). B5, the recently discovered filamentous phage
that infects the Gram-positive Propionibacterium freudenreichii,
also lacks a gene IV (18); since Gram-positive bacteria
lack an outer membrane, presumably neither pIV nor a
bacterial equivalent is necessary. Whereas adjacent genes
encode two small minor coat proteins (pVII and pIX) located
at the same end of the particle in E. coli phages, a single gene
is located in the analogous region of the Pf3 and some other
phage genomes, which may encode a functionally equivalent
fused minor coat protein. The major coat proteins of
the E. coli phages are synthesized with a signal sequence,
while those of Pf3 (90), PH75, a filamentous phage of Thermus
thermophilus (119), and B5 (18) are not. Several filamentous
phages of Vibrio and Xanthomonas, but not those of
E. coli or Pseudomonas aeruginosa, lysogenize their host
(10, 93, 163). Finally, CTXφ is so far unique among filamentous
phages (although not among phages in general) in
carrying additional genes, in this case the genes that
encode cholera toxin (163). Unless specified, the properties
of filamentous phage described below refer to the almost
identical f1, M13, and fd (the “Ff,” for F-specific filamentous
phage; see “Infection” below).


Structure of the Phage Particle 

Figure 12-2 Filamentous phage visualized by atomic force
microscopy. The large size of the phage tip
structure suggests that it is the pIII-pVI end. The lollipop-like
images of pIII sometimes observed by electron
microscopy may represent partially denatured pIII.

An image of a filamentous phage obtained by atomic force
microscopy is shown in figure 12-2. The particles are about 6.5 nm
in diameter, with a length determined by the size of the
genome, normally 6-7 kb of single-stranded DNA. The
6400-nucleotide Ff genome is encapsidated in a 930 nm
particle; a 221-nucleotide “microphage” variant is only
50 nm long (152), and the presence of cloned DNA in the
phage genome makes the particle proportionately longer.
Although there is no theoretical limit to the amount of
DNA that can be packaged in a single particle, as there is
for icosahedral phages of fixed dimensions, there may be
a practical limit; phage containing about 6 kb of additional
DNA make small plaques, so even larger inserts would likely
have even more deleterious effects. Even normal particle
length is selected against when phage genes are made dispensable
(by providing them from another source): smaller
particles accumulate with genomes that contain cis-active
replication and packaging signals but no intact genes (71).
Much longer particles, 10 or more times the normal unit
length, can be generated by eliminating pIII, the phage
protein necessary to terminate assembly and release the
particle (and to mediate infection), but these are noninfectious
particles that contain multiple unit-length
genomes (122).
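
The proportionality between genome size and particle length can be turned into a rough estimate. The Python sketch below fits a straight line through the two data points quoted above (6400 nucleotides in a 930 nm particle; 221 nucleotides in a 50 nm particle) and extrapolates to other genome sizes. The linear form, the function names, and the example insert sizes are illustrative assumptions rather than measurements; the intercept in particular is only a crude stand-in for whatever length the particle ends contribute.

```python
# Two (genome size in nt, particle length in nm) pairs quoted in the text.
FF_GENOME_NT, FF_LENGTH_NM = 6400, 930
MICROPHAGE_NT, MICROPHAGE_LENGTH_NM = 221, 50


def fit_line(x1, y1, x2, y2):
    """Return (slope, intercept) of the straight line through two points."""
    slope = (y2 - y1) / (x2 - x1)
    return slope, y1 - slope * x1


# Assumption: length grows linearly with genome size between and beyond
# the two quoted points. The fit is exact at those points by construction.
NM_PER_NT, END_OFFSET_NM = fit_line(
    MICROPHAGE_NT, MICROPHAGE_LENGTH_NM, FF_GENOME_NT, FF_LENGTH_NM
)


def estimated_length_nm(genome_nt):
    """Estimate particle length (nm) for a genome of genome_nt nucleotides."""
    return NM_PER_NT * genome_nt + END_OFFSET_NM


if __name__ == "__main__":
    print(f"rise per nucleotide: {NM_PER_NT:.3f} nm")
    for insert_nt in (0, 1000, 6000):  # hypothetical cloned-insert sizes
        total_nt = FF_GENOME_NT + insert_nt
        print(f"{total_nt} nt -> about {estimated_length_nm(total_nt):.0f} nm")
```

Under this assumption a 6 kb insert nearly doubles the particle length, which fits the observation that such phage are already at a growth disadvantage.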

The protein tube that surrounds the single-stranded
DNA is composed of several thousand copies of pVIII, the
50-residue major coat protein (in the Ff), oriented at a 20°
angle from the particle axis and overlapped like fish scales
to form a right-handed helix (106). The filament is held
together by interactions between the hydrophobic midsections
of adjacent subunits; other crucial interactions
have been postulated between the hydrophobic face of an
N-terminal amphipathic helix on one subunit and the
hydrophobic face of a C-terminal amphipathic helix on a
neighboring subunit (129). Except for five surface-exposed
N-terminal residues, each pVIII subunit forms a single,
continuous α-helix. The positively charged residues near
the C-terminus are at the inner surface of the tube and
interact with phosphates of the viral single-stranded DNA
(54, 73, 130).

Electron micrographs reveal that each end of the particle 
has a distinct morphology. The blunt end contains three 
to five copies each of pVII and pIX, two of the smallest ribosomally
translated proteins with a defined function that
are known (33 and 32 residues, respectively). Immunological
evidence indicates that at least some of pIX is exposed (36), 
and since N-terminal display is possible on both (48), both 
N-termini are probably at or near the surface. Both proteins 
are quite hydrophobic; the N-terminal portions of pVII from 
several phages are negatively charged, whereas there are two 
or three positively charged residues near the C-terminus 
of the pIXs. Phage assembly begins at the pVII-pIX end, and 
if either protein is absent, particle production does not 
occur (98). 

The other end of the particle is pointed, and lollipop-like
knobs can be seen to extend from the tip in certain preparations
(52). This end contains about five copies each of pIII
and pVI, the proteins that mediate phage entry and exit.
The N-terminal domain of pIII binds the infecting phage to
its cellular receptors, the tip of the F pilus and the C-terminal
domain of TolA (11, 19), and pVI must be present in order
for pIII to be incorporated into the particle (121). These
two proteins are also necessary for phage assembly to terminate,
that is, for newly formed phage to detach from the cell
membrane. If either is absent, “polyphage,” which contain
multiple phage genomes encased in particles 10-20 times
the normal phage length, accumulate and remain tethered
to the cell (122). A complex composed of pIII and pVI can
be isolated from phage particles (47). Although the two proteins
probably do associate in the cell membrane (since pVI is degraded
in cells that lack pIII), no complex could be detected there, which
is consistent with the proposal that detachment involves
a conformational change in the complex that increases
its stability (121). The disposition of pVI in the particle is not
known: pVI with fusions at the C-terminus can be incorporated
into phage (albeit inefficiently), suggesting that this
portion of the 112-residue pVI can be exposed (121). The
N-terminal domain of the 406-residue pIII is surface
exposed and is responsible for the “knob” structures (2). The
C-terminal 132 residues of pIII are necessary and sufficient
for incorporation into the phage particle, termination of
assembly, and release of phage from the cell; this domain is
likely to be buried within the particle (121).

The single-stranded phage genome is oriented and
anchored within the phage particle by the packaging signal
(PS), which is located in the noncoding IG region of the
genome and is positioned at the pVII-pIX end of the particle
(56, 168). The PS, an imperfect but extremely stable hairpin,
is necessary and sufficient for efficient encapsidation of
circular single-stranded DNA into phage particles, whereas
an “artificial” PS (a perfect duplex of equivalent length) is
a poor substitute (140). Certain amino acid substitutions in
pVII, pIX, and pI (see below) enable single strands that lack
a PS to be encapsidated; it is not known whether the DNA is
randomly oriented in such particles or whether some small
duplex region serves as a secondary PS.

Infection 

A simplified cartoon illustrating the filamentous phage
life cycle is shown in figure 12-3. Filamentous phages use
pili as primary receptors to infect cells. Pili are long filamentous
structures that emanate from the cell surface. There
are different types of pili; those utilized by the phage are
anchored in the cytoplasmic membrane and are capable
of retraction, in which the subunits depolymerize back into the
membrane. The E. coli phages use conjugative pili, self-transmissible
pili that mediate transfer of the plasmid that
encodes them to recipient bacteria (147). The f1/M13/fd
phages (which are almost identical in sequence and are
referred to collectively as Ff) bind F pili, while IKe uses N
or P pili. CTXφ uses nonconjugative type IV pili (Tfp) (163).
Tfp mediate adhesion to eukaryotic cells and are responsible
for a form of solid-surface locomotion called “twitching
motility” (164). Phage can infect cells that lack appropriate
pili, but the process is extremely inefficient; the efficiency
is improved 2-4 orders of magnitude by agents that concentrate
the phage or promote its adherence to the cell surface,
such as calcium chloride and polyethylene glycol (142).

Figure 12-3 Life cycle of filamentous phage f1/M13/fd. The phage (via pIII, located at one end) binds to the tip of the F pilus.
Pilus retraction brings the phage close to the cell surface where pIII then binds to the periplasmic domain of host TolA.
The cytoplasmic membrane-anchored TolA, Q, and R proteins mediate depolymerization of the phage coat proteins into
the membrane (where they are available for reutilization) and entry of the phage circular single-stranded DNA into the
cytoplasm. The single-stranded DNA is converted to a double-stranded, supercoiled replicative form by the action of host
RNA polymerase, DNA polymerase III, and gyrase. Supercoiled RF is a template for phage gene expression and rolling
circle replication, which generates a single-stranded DNA molecule. When sufficient pV has accumulated, pV dimers cover
the single-stranded DNA, leaving the PS hairpin exposed at one end; the pV-single-stranded DNA complex is a substrate
for assembly. The five coat proteins are integral cytoplasmic membrane proteins prior to their incorporation into the
phage particle.

Figure 12-4 The (−) strand origin of f1 functions as an unusual promoter. The structure of the (−) strand origin as deduced
from footprinting and mutational studies. Adapted from (62).

Two surface-exposed N-terminal domains of the minor
coat protein pIII mediate infection by binding first to the
tip of the F pilus and then to the host TolA protein (155).
In the Ff pIII, the N1 (or D1) domain (residues 1-68) that
binds TolA precedes the pilus-binding domain (the N2 or
D2 domain, residues 87-217) (28), but the position of the
pilus-binding and TolA-binding domains is reversed in IKe
(34). The Ff and IKe pIII proteins are poorly conserved and
cannot replace one another to assemble into the heterologous
phage (15, 35). Ff pIII has a second glycine-rich domain
after N2 that causes cells to leak periplasmic contents and
become hypersensitive to detergents, whereas IKe pIII has
a single glycine-rich domain located between N1 and N2
and does not induce outer membrane defects (9, 15, 123).
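
The residue numbers given for Ff pIII can be collected into a small domain map. The Python sketch below records the N1 and N2 boundaries quoted above, derives the C-terminal domain from the 406-residue length and the 132-residue C-terminal segment described under “Structure of the Phage Particle,” and lists whatever is left over. It assumes that all of these figures use one residue numbering; the class and function names and the wording of the role strings are illustrative only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Domain:
    name: str
    start: int  # first residue, 1-based, inclusive
    end: int    # last residue, inclusive
    role: str

    def length(self):
        return self.end - self.start + 1


PIII_LENGTH = 406        # residues, as quoted in the text
C_TERMINAL_LENGTH = 132  # residues, as quoted in the text

# Assumption: every boundary below refers to the same numbering of pIII.
FF_PIII_DOMAINS = [
    Domain("N1 (D1)", 1, 68, "binds TolA"),
    Domain("N2 (D2)", 87, 217, "binds the tip of the F pilus"),
    Domain("C-terminal", PIII_LENGTH - C_TERMINAL_LENGTH + 1, PIII_LENGTH,
           "incorporation into the particle, termination, release"),
]


def unassigned_segments(domains, total_length):
    """Return (start, end) residue ranges not covered by any listed domain."""
    gaps = []
    next_free = 1
    for domain in sorted(domains, key=lambda d: d.start):
        if domain.start > next_free:
            gaps.append((next_free, domain.start - 1))
        next_free = max(next_free, domain.end + 1)
    if next_free <= total_length:
        gaps.append((next_free, total_length))
    return gaps


if __name__ == "__main__":
    for domain in FF_PIII_DOMAINS:
        print(f"{domain.name}: {domain.start}-{domain.end} "
              f"({domain.length()} residues), {domain.role}")
    print("unassigned:", unassigned_segments(FF_PIII_DOMAINS, PIII_LENGTH))
```

The two unassigned stretches fall between N1 and N2 and between N2 and the C-terminal domain; the second is where the text places the glycine-rich region that follows N2 in Ff pIII.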

Pili normally assemble and disassemble continuously,
and this, possibly stimulated by phage binding, brings the
phage close to the cell surface. Crystallographic analyses
indicate that the N2 and N1 domains of pIII interact as
a horseshoe-shaped structure (66, 67, 100). Binding of the
N2 domain to the pilus releases the N1 domain, making it
available to bind to the C-terminal domain of the host TolA
protein, TolA-III (99, 126). Mutational studies implicate the
outer surface of the N2 portion of the “horseshoe” in F-pilus
binding (27). TolA-III, which may contact the outer membrane,
is tethered via a long periplasmic linker (TolA-II) to
TolA-I, which spans the cytoplasmic membrane and interacts
with the membrane proteins TolQ and TolR (19, 167).
These three Tol proteins are absolutely required for phage
infection (158, 159). How the phage penetrates the outer
membrane and the underlying peptidoglycan layer to
engage TolA is not known. The Tol proteins mediate depolymerization
of the phage coat proteins into the cytoplasmic
membrane and translocation of the viral single-stranded
DNA into the bacterial cytoplasm (20). The molecular details
of how this is accomplished are completely obscure, though
it has been suggested that the conversion of phage filaments
into spheroid-shaped particles at chloroform-water
interfaces may mimic a spontaneous event that occurs
when the phage meets a membrane (30, 56). The precise role
of the Tol proteins with respect to the bacterium is not
known, but they are important for maintaining the integrity
of the bacterial outer membrane; bacteria that lack a Tol
protein are hypersensitive to detergents and leak their periplasmic
contents, which mimics the effect of the presence of
Ff pIII. Recently a role for TolA in delivering lipopolysaccharides
to the outer membrane has been demonstrated (49),
and it has been proposed that the Tol proteins, which are
analogs of the TonB system, may transmit energy from the
proton-motive force to the outer membrane (96).

Replication 

Once the single-stranded viral DNA (the (+) strand)
enters the cytoplasm, host enzymes convert it to a double-stranded,
supercoiled molecule (RF, or replicative form).
An RNA primer is required to synthesize the (−) strand.
This 20 nt primer is generated by RNA polymerase, which
initiates synthesis at an unusual site (the (−) strand origin,
figure 12-4) located in the noncoding intergenic (IG) region
of the (+) strand (63, 64, 72). The site consists of two adjacent
hairpins separated by a single-stranded region; one
hairpin sequence includes a promoter-like −35 motif and
the other has a −10 motif (72); the affinity of RNA polymerase
for this structure is much greater than for an authentic
promoter such as the lacUV5 promoter (62). The RNA
primer is extended by host DNA polymerase III, the newly
replicated strand is ligated, and the double-stranded product
is supercoiled by gyrase. RF is the template for phage gene
expression. Phage gene expression (in particular, synthesis
of pII, a site-specific nicking-closing enzyme) is necessary
for further replication of the initial RF. pII nicks the (+)
strand of the RF at the (+) strand origin in the IG region
(53, 72). This is a complex site that includes sequences for
binding IHF and pII, for nicking, and a site necessary for termination
of synthesis. The 3' end of the nick is elongated by host
DNA polymerase III using the (−) strand as template (61).
The original (+) strand is displaced by Rep helicase as
the new (+) strand is synthesized, and when a round of
replication is complete the displaced (+) strand is recircularized
by the nicking-closing activity of pII and again
converted to RF.

Early after infection, when the concentration of the
phage-encoded single-stranded DNA-binding protein (pV) is
low, newly synthesized single strands are immediately
converted to RF, and both RF and phage proteins increase
exponentially. As its concentration increases, pV binds cooperatively
to newly generated (+) strands. This covers them
and prevents polymerase access, thereby blocking their
conversion to RF. pX, which is identical to the C-terminal
111 residues of pII, is required for the stable accumulation
of single strands at this stage, but the mechanism by which
it acts is not known (45, 46). pV is dimeric, with the interaction
surface of the subunits opposite the DNA-binding
surface (88, 148). Upon binding, the back-to-back arrangement
of the dimers collapses the circular single strand
into a rodlike structure (51). The DNA is oriented in this
complex, with the PS hairpin protruding from one end (5).
This presumably occurs because pV binds double-stranded
DNA only weakly. The fact that Y-shaped or other more
complicated pV-ssDNA structures (which would indicate
multiple initial binding events) are not seen in vivo suggests
that there may be a favored nucleation point for pV binding.
pV does bind preferentially to the DNA analog of the site
on gene II mRNA (108) at which it represses translation
(44, 109), but this site is 400 nucleotides away from the PS;
ideally, a nucleation site should be adjacent to the PS so
that cooperative interactions between pV dimers would
“zip” the opposing portions of the single-stranded circle into
the rod.


Genes and Gene Expression 

The Ff genome (~6400 nt) contains nine closely packed
genes and one major noncoding region (the IG), which
contains the (+) and (−) strand replication origins and
the PS (figure 12-1). Two of the phage genes encode two
proteins, for a total of 11 phage-encoded proteins. Genes I
(124) and II (45) have internal translational initiation
sites from which in-frame restart proteins are produced. In
each case, both the full-length and the restart protein
(whose sequence is identical to the C-terminal third of the
full-length protein) are necessary for successful phage
production. The relative levels of pII and pX determine the
distribution of phage DNA species in infected hosts (87). The
absence of pII totally prevents phage DNA synthesis, while
the absence of its restart partner, pX, prevents accumulation
of (+) strands (46). The pI/pXI pair is described below.
Of the 11 phage-encoded proteins, three (pII, pX, pV) are
required to generate single-stranded DNA, three (pI, pXI,
pIV) are required for phage morphogenesis, and five (pIII,
pVI, pVII, pVIII, pIX) are components of the phage particle.
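
The bookkeeping in this paragraph (nine genes, 11 proteins, three functional groups) is easy to check mechanically. The Python sketch below is one way to write the inventory down. The text states explicitly only that genes I and II each yield two proteins (pI/pXI and pII/pX); pairing each remaining gene with the protein of the same numeral follows the usual naming convention and is an assumption here, as are the dictionary and function names.

```python
# Gene -> protein products. Genes I and II each yield a full-length
# protein plus an in-frame restart protein, as described in the text.
GENE_PRODUCTS = {
    "I": ["pI", "pXI"],
    "II": ["pII", "pX"],
    "III": ["pIII"],
    "IV": ["pIV"],
    "V": ["pV"],
    "VI": ["pVI"],
    "VII": ["pVII"],
    "VIII": ["pVIII"],
    "IX": ["pIX"],
}

# Functional groups exactly as listed in the text.
ROLES = {
    "single-stranded DNA synthesis": {"pII", "pX", "pV"},
    "morphogenesis": {"pI", "pXI", "pIV"},
    "virion component": {"pIII", "pVI", "pVII", "pVIII", "pIX"},
}


def all_proteins():
    """Flatten the gene table into a list of protein names."""
    return [p for products in GENE_PRODUCTS.values() for p in products]


def role_of(protein):
    """Return the single functional group a protein belongs to."""
    matches = [role for role, members in ROLES.items() if protein in members]
    if len(matches) != 1:
        raise ValueError(f"{protein} is in {len(matches)} groups, expected 1")
    return matches[0]


if __name__ == "__main__":
    print(len(GENE_PRODUCTS), "genes,", len(all_proteins()), "proteins")
    for protein in all_proteins():
        print(f"{protein}: {role_of(protein)}")
```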

Unlike the situation in phages with larger genomes,
where temporal regulation of gene expression is the rule,
filamentous phage proteins are synthesized concurrently.
Diverse mechanisms ensure that each is produced at an
appropriate rate. There are differences in promoter and ribosome
binding site strength or accessibility (7, 8). Gene VII
is translated from an inherently defective translation initiation
site and is coupled to its upstream gene (76, 77, 182).
A weak rho-dependent termination signal at the beginning
of gene I limits its transcription, and a large number of infrequently
used codons reduces the rate of its translation
(70). At the other end of the spectrum, overlapping transcripts
from multiple promoters (there are only two terminators)
and multiple RNA processing events by RNaseE (157)
increase the abundance of RNAs for the genes closest to the
terminators (50). This results in high levels of pV and pVIII,
the proteins that are required in the greatest quantities.
IKe exhibits a similar pattern of overlapping mRNAs with a
common 3' end, but surprisingly, the only highly conserved
regulatory element is the rho-independent terminator that
generates one of the common 3' ends (156).

The rates of phage protein and DNA synthesis slow
at later times after infection, when high concentrations of
pV sequester the (+) strands and prevent their conversion
to RF. The reduced amount of RF (i.e., template) results in
a lower level of gene expression. In addition, excess pV
represses the translation of pII and pX, and the reduction in
pII levels leads to even lower rates of (+) strand synthesis
(44, 46). pV, which is a nonspecific single-stranded DNA
binding protein, represses translation by binding specifically
to a GT-rich tetraplex structure that is present in both
gene II and gene X mRNAs, just upstream of their translation
initiation sites (109, 115). The net result of these control
systems (under optimal conditions) is that a steady-state
level of phage DNA and proteins is maintained; synthesis
is balanced by secretion of progeny phage and by the continued
growth and division of the infected cells. Thus phage
production continues indefinitely at a linear rate.


Phage Assembly 

Filamentous phage assembly is a secretory process. Assembly
occurs in the cytoplasmic membrane, and nascent
phages are secreted from the cell as they assemble
(figure 12-5). All eight of the phage-encoded proteins that
are directly involved in assembly are integral membrane
proteins. These comprise the five viral coat proteins, which
reside in the cytoplasmic membrane prior to their
incorporation into phage (36), and three nonvirion proteins:
pI and its restart partner, pXI, in the cytoplasmic membrane
(69), and pIV in the outer membrane (12). Two of the coat
proteins (pIII and pVIII) are synthesized as precursors and, after signal
sequence cleavage, span the membrane with their C-termini
in the cytoplasm. The orientation of the other coat proteins
is not known with certainty. Like the major coat protein,
pVII and pIX contain one or more N-terminal negatively
charged residues and one or more C-terminal positively
charged residues, which, according to the “positive inside”
rule articulated by von Heijne (161), suggests that, like
pVIII, they span the membrane with their C-termini in the
cytoplasm. Furthermore, successful display of proteins
fused to the N-termini of pVII and pIX appears to require
a signal sequence (48), which also suggests that the
N-termini are periplasmic. pVI is particularly hydrophobic
and could span the membrane more than once.
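
The charge argument used above for pVII and pIX can be written as a very small heuristic. The Python sketch below compares the net charge of the two flanks of a single membrane-spanning segment and guesses that the more positive flank is cytoplasmic. It is a deliberately crude reading of the “positive inside” rule: it counts only Lys/Arg and Asp/Glu, ignores the chain termini, and takes the position of the hydrophobic segment as given. The peptide is a made-up toy sequence, not the sequence of any phage protein, and the function names are illustrative.

```python
POSITIVE = set("KR")  # Lys, Arg
NEGATIVE = set("DE")  # Asp, Glu


def net_charge(segment):
    """Net charge counting only Lys/Arg (+1) and Asp/Glu (-1)."""
    return sum((aa in POSITIVE) - (aa in NEGATIVE) for aa in segment)


def predict_cytoplasmic_end(sequence, tm_start, tm_end):
    """Guess which flank of one membrane-spanning segment is cytoplasmic.

    tm_start and tm_end are 0-based slice indices (end exclusive) of the
    hydrophobic segment, which must be supplied by the caller.
    """
    n_charge = net_charge(sequence[:tm_start])
    c_charge = net_charge(sequence[tm_end:])
    if n_charge == c_charge:
        return "undetermined"
    return "C-terminus" if c_charge > n_charge else "N-terminus"


# Hypothetical toy peptide: acidic N-flank, hydrophobic core, basic C-flank.
TOY_N_FLANK = "MEDSE"
TOY_CORE = "LAVLVGAIVLAALIGLAVW"
TOY_C_FLANK = "KRSKA"
TOY_PEPTIDE = TOY_N_FLANK + TOY_CORE + TOY_C_FLANK

if __name__ == "__main__":
    start = len(TOY_N_FLANK)
    end = start + len(TOY_CORE)
    print("N-flank charge:", net_charge(TOY_N_FLANK))
    print("C-flank charge:", net_charge(TOY_C_FLANK))
    print("predicted cytoplasmic end:",
          predict_cytoplasmic_end(TOY_PEPTIDE, start, end))
```

A protein with a negatively charged N-terminal flank and a positively charged C-terminal flank, the pattern described for pVII and pIX, comes out with its C-terminus assigned to the cytoplasm.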

Progeny phage particles first appear in the culture
supernatant about 10 minutes after infection at 37°C. Their
numbers increase exponentially for about 40 minutes, after
which the rate becomes linear. About 1000 phage per cell
are produced in the first hour. Under optimal conditions,
the infected cells can continue to grow and divide, and to
produce phage, indefinitely. But even modest perturbations
of the phage life cycle can lead to the eventual death
of infected cells and formation of clear rather than the
normal turbid plaques (140). Most nonproductive infections
result in death of the infected bacteria, presumably
due to the accumulation of phage DNA in the cytoplasm
and/or capsid proteins in the cytoplasmic membrane.
The exceptions are mutants that lack either pII or pIII. The
absence of pII prevents phage DNA replication; as a consequence,
little phage gene expression occurs and the phage
DNA is lost by dilution. When pIII is absent, phage filaments
are still efficiently assembled and extruded across the
membrane but they remain attached to the cell; thus
phage DNA does not accumulate in the cytoplasm and
structural proteins do not accumulate in the cytoplasmic
membrane (122).

Figure 12-5 A hypothetical model for assembly and extrusion
of filamentous phage. Initiation of phage assembly
occurs when a cytoplasmic region of pI recognizes the
packaging signal (PS), which protrudes from one end of the
pV-ssDNA complex. Assembly occurs at sites where the
cytoplasmic and outer membranes are brought together
by trans-envelope interactions between pI and pXI in the
cytoplasmic membrane and pIV in the outer membrane.
Assembly starts with the addition of three to five copies
each of two small “tip” proteins (pVII and pIX). During
elongation, which requires host thioredoxin (TrxA) and ATP
hydrolysis (presumably by pI), pV is stripped from the DNA
and several thousand copies of the major coat protein,
pVIII, are added to form the phage tube, which surrounds
the single-stranded DNA. When the end of the DNA has
been reached, pIII and pVI are added, and a conformational
change in pIII detaches them from the membrane. The
completed phage is thereby released, and extruded
through the pIV channel.

Phage assembly occurs at distinct membrane assembly
sites. Assembly sites have been visualized by electron microscopy
as regions where the cytoplasmic and outer membranes
are in close contact (97). The assembly site is a
trans-envelope complex of the three phage-encoded
morphogenetic proteins; genetic (24, 136) and
biochemical (40) results indicate that the C-terminal
periplasmic domain of pI and pXI (cytoplasmic membrane-spanning
proteins) interacts with the N-terminal periplasmic
domain of pIV (an outer membrane protein). The
assembly site forms independently of other phage proteins,
and when its formation is blocked by mutation, pI becomes
susceptible to cleavage by a host protease (40).


pIV forms a large multimer in the outer membrane
that is composed of 12 to 14 identical subunits (83, 116, 134).
Like the outer membrane porins, the multimer is extremely
resistant to dissociation by detergents (94). As illustrated in
cartoon form in figure 12-5 (but based on actual aligned
images), the pIV multimer is somewhat barrel-shaped in side
view; in top views, it is cylindrical with a central cavity
(~8 nm in diameter) sufficient to accommodate an emerging
phage (95). This is a much larger diameter than that of
known outer membrane channel-forming proteins (86),
but there appears to be some density within the cavity
(116), which could explain why the presence of wild-type
pIV does not disrupt the integrity of the outer membrane.
Certain amino acid substitutions in pIV, however, do render
bacteria hypersensitive to detergents and allow entry of
foreign substances into the bacterial periplasm (102, 139).
In planar lipid bilayers, one such mutant pIV forms highly
conductive channels; wild-type pIV has similar conductivity,
but its probability of being open is much lower than that
of the mutant (102). Phage pass through the pIV channel, as
shown by the ability of tethered phage (that lack pIII) to
block the pIV-dependent diffusion of maltodextrins across
the outer membrane (103). Nothing is known about the
channel interior, except that it can accommodate heterologous
pVIII from IKe (132), as well as an occasional pVIII
subunit carrying a large N-terminal extension (81) or pVIII
uniformly substituted with short (6-8 residue) N-terminal
extensions (55, 74). The size and/or properties of the channel
may be a limiting factor in phage display.

The N-terminal third of pIV forms a trypsin-resistant
domain that extends into the periplasm (12). Although pIV
from IKe cannot replace f1 pIV to assemble f1 phage (132),
a chimeric IKe-f1 pIV could, showing that pIV specificity is
determined by this domain (24). Genetic analysis (136) and
the poor conservation of the periplasmic domains of pI/pXI
from IKe and f1 suggest that the periplasmic domain
of pIV interacts with pI/pXI. The C-terminal half of pIV
mediates association with the membrane (138). Proteins
homologous to this region of pIV (a span of about 200
residues) have been identified in many Gram-negative
bacterial species and are called “secretins.” Bacterial secretins
are essential components in either of two different
specialized protein secretion systems (135). Both type II and
type III systems secrete proteins across both bacterial
membranes and, like pIV, the homologs form multimeric
rings (133, 154).

pI and pXI form an equimolar multimeric complex 
composed of about five or six copies of each (39); its shape 
and dimensions are not known. In the absence of the other 
phage proteins, pI/pXI causes membrane depolarization 
(69, 70), which suggests that the complex may also be a 
channel. pXI, the product of translational reinitiation at 
codon 241 in gene I, is required (in addition to pI) for phage 
production (124); however, despite the identity of the 
sequences they share, certain mutations that are tolerated in 
pXI do not support phage assembly if they are present in both 
pI and pXI (58). The cytoplasmic N-terminal domain of pI 
from all known filamentous phages contains a nucleotide-binding 
motif. The integrity of this motif is essential for 
f1 phage assembly (40). Furthermore, f1 phage assembly 
requires ATP hydrolysis (41). A likely inference is that pI is 
an ATPase. The pI proteins of Ff and IKe are not interchangeable 
(132), but if pI and pIV are swapped simultaneously, 
heterologous phage assembly occurs, indicating that only 
homologous protein pairs interact (136). 

Particle assembly requires the assembly site (pI, pXI, and 
pIV), an appropriately presented single-stranded DNA molecule, 
the major coat protein, two minor coat proteins (pVII 
and pIX), and host-encoded thioredoxin. Genetic analyses 
suggest that pVII and pIX interact with the PS, which 
protrudes from the pV-ssDNA complex, that the PS also 
associates with the cytoplasmic domain of pI (140), and that 
pI interacts with thioredoxin, a small, cytoplasmic protein 
known as a potent reductant of protein disulfides (137). 
Thioredoxin may confer processivity to the elongation reaction, 
as it does in phage T7 DNA replication (6). Its redox 
activity is dispensable for both processes (141). Phage elongation 
involves the successive replacement of pV dimers by 
the membrane-embedded coat proteins and the simultaneous 
translocation of the DNA across the membrane. 
Perhaps the pI/pXI complex is a channel that the coat 
proteins must penetrate to gain access to the DNA; alternatively, 
the complex may be a “screw” that winds the 
DNA across the membrane, picking up coat proteins on 
the way. 

Phage elongation continues until the end of the viral 
DNA has been coated by pVIII. If either pIII or pVI is absent, 
the largely extracellular phage particle remains tethered 
to the cytoplasmic membrane, where it remains competent 
to resume elongation when another pV-ssDNA complex 
enters the assembly site; ultimately, tethered phage filaments 
10 times or more the length of a normal phage particle 
accumulate (122). Even when pIII and pVI are present, about 5% 
of progeny phage particles are double length. Secondary 
rounds of elongation do not require reinitiation: 
single-stranded DNA without a PS can be efficiently incorporated 
as a passenger, behind DNA that does contain a PS (140). 

Upon incorporation of the membrane-embedded pIII-pVI 
complex at the terminal end of the nascent phage particle, 
these proteins are thought to undergo a conformational 
change that stabilizes them and detaches them from the 
membrane (121). A fragment containing only the C-terminal 
83 residues of pIII is sufficient to bind pVI and incorporate 
the incomplete dimer onto the particle, but it cannot detach 
the phage from the cell. Release of the phage requires 
a slightly longer (93 residue) C-terminal segment of pIII. 
A still longer portion of pIII (the 132 C-terminal residues) 
is required for the formation of stable virus particles. 


Host Responses to Filamentous 
Phage Infection 

Infection by filamentous phage induces a transient alteration 
in phospholipid metabolism: the rate of 
phosphatidylglycerol and cardiolipin synthesis increases and 
phosphatidylethanolamine synthesis decreases (178). This 
effect is due to the accumulation of pVIII in the membrane, 
which inhibits the activity of E. coli phosphatidylserine 
synthetase (17). 

Infection also affects the phosphorylation level of several 
host proteins: phosphorylation of the heat shock protein 
DnaK and at least seven other proteins increases, while 
that of several other proteins declines (127). 

Whether expressed as a consequence of phage infection 
or from a cloned gene, pIV induces the high-level synthesis 
of a set of bacterial proteins called phage shock proteins 
(13, 14). The psp genes are so named because their expression 
is induced by phage infection and by other shocks, 
such as heat and osmotic shock (14). Transcription of the 
psp genes is also induced by bacterial homologs of pIV (59, 
174) and occurs as a consequence of improper localization 
(60). Similarly, when delivery of pIV to the outer membrane 
was improved, psp induction disappeared (23). How the 
signal is transduced is not known. 

The psp genes (pspABCE) are arranged in an operon 
controlled by σ54 (111). Expression of the operon requires 
PspF, a constitutively active transcriptional activator encoded 
by a gene adjacent to and divergently transcribed 
from the pspABCE operon (32, 80). By binding to the 
upstream activating sequences necessary to activate the 
psp operon, PspF obscures its own σ70 promoter, thereby 
autogenously repressing its own synthesis (79). PspA negatively 
regulates the operon. Its failure to bind to DNA 
suggested that it might act by binding to PspF (31), and 
this has recently been confirmed (1). PspB and PspC cooperatively 
activate the operon, perhaps by antagonizing 
PspA-controlled repression (169). Both PspB and PspC bind to 
PspA in vitro, but PspB binds only when PspC is present 
(1). Transcription of psp is also dependent on IHF, which 
bends the DNA upstream of the promoter, thereby positioning 
PspF with respect to σ54-RNA polymerase bound at 
the promoter (33, 170). The function of the Psp proteins, 
other than to regulate their own expression, is not clear. 
The psp operon is highly induced in late stationary phase 
(171) and when the Sec protein secretion system is blocked 
(84). Bacteria that lack the pspABC genes show a decreased 
ability to survive in stationary phase under alkaline conditions 
(171). These observations suggest a role, probably 
for PspA, in maintenance of endogenous energy sources 
(84, 171). These mutant phenotypes are fairly subtle, but 
in Yersinia enterocolitica, the psp genes are required for virulence 
and for growth in vitro when the Ysc type III secretion 
system is produced (25). 
Homologs of PspA have been identified in several 
higher plants, where they localize to the inner and thylakoid 
membranes of chloroplasts (89, 92); a disruption mutant 
of vipp1, which encodes the homolog, formed aberrantly 
structured thylakoid membranes (175). 

Filamentous Phage Display 

Phage display is an extraordinarily powerful method of 
linking phenotype and genotype in the selection of proteins 
or peptides (see also chapter 44). The principle is very simple: 
a DNA sequence is inserted into a gene encoding a phage 
coat protein so as to make a fusion protein in which the 
coat protein now “displays” the protein product of the 
inserted sequence. If a ligand is bound to a solid support, 
those phages expressing a protein or peptide that binds 
to the ligand will be immobilized, the remaining phages 
can be washed away, and the bound phage released and 
amplified by another cycle of growth. This process, called 
panning, can then be reiterated. Under the best of circumstances 
an enrichment of 10^4 to 10^5 can be achieved at 
every cycle, and usually four or five cycles of binding and 
regrowth suffice to produce one or a few selected phage 
carrying inserts of high affinity. 
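
The arithmetic behind the number of panning cycles can be sketched 
as follows. This is a minimal illustration, not a description of any 
published protocol: it assumes that each cycle multiplies the ratio 
of binders to non-binders by a constant enrichment factor, and the 
starting frequency of one binder in 10^9 phage is a hypothetical 
value chosen only for the example. 

```python
def binder_fraction(initial_fraction: float, enrichment: float, rounds: int) -> float:
    """Fraction of binding phage after a number of panning cycles.

    Assumption (idealized): every cycle multiplies the binder:non-binder
    odds by the same enrichment factor.
    """
    odds = initial_fraction / (1.0 - initial_fraction)
    odds *= enrichment ** rounds
    return odds / (1.0 + odds)


def rounds_until_majority(initial_fraction: float, enrichment: float) -> int:
    """Smallest number of cycles after which binders exceed half the pool."""
    rounds = 0
    while binder_fraction(initial_fraction, enrichment, rounds) <= 0.5:
        rounds += 1
    return rounds


if __name__ == "__main__":
    start = 1e-9  # hypothetical starting frequency of binders in the library
    for enrichment in (1e2, 1e4, 1e5):
        cycles = rounds_until_majority(start, enrichment)
        print(f"enrichment {enrichment:.0e} per cycle: {cycles} cycles")
```

Under these assumptions the best-case enrichment needs only two or 
three cycles, whereas a hundredfold enrichment per cycle needs five, 
which is in line with the four or five cycles usually required. 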

Most phage display utilizes libraries that contain at least 
some random sequences. Random peptides may be sandwiched 
in between a constant framework, constrained by 
paired cysteine residues, or placed in an exposed loop of 
a suitable protein scaffold. Often sequences of antibody 
combining sites are randomized. Antibody FABs can be 
expressed, either as single-chain antibodies (in which the 
two chains are covalently joined by a linker) or as individual 
chains (177). The other chain, if equipped with a signal 
sequence, can be transported into the periplasm, and there 
will pair with its partner to form an FAB (3). 

This description is, of course, one for an idealized case. 
Although the method has worked as described in many 
instances, complications and difficulties show up, and 
many routes have been found to overcome or circumvent 
the problems. The utility of the method is such that it 
has become a cottage industry. Since its introduction by 
George Smith in 1985 (149), thousands of phage display 
papers have appeared in the literature, and both the 
methodology and the results are reviewed regularly. 

Initial cloning was into gene III, which encodes the 
minor protein that helps to terminate assembly and that 
is required for infection of a new bacterial host. The original 
cloning sites were at or near the N-terminus (which is prominently 
displayed on the surface of the phage) at positions 
in which relatively small inserts do not interfere with any 
of the protein's functions (figure 12-6C). Larger proteins 
do interfere with function but the phages often are still 
infective. For pIII display of larger proteins, the chimeric 
protein is usually expressed from a plasmid (a phagemid) that, 
besides its own origin and antibiotic resistance gene, bears a phage 
origin of replication and a packaging signal. When a cell 
carrying such a plasmid is infected with a helper phage, 
both phagemid and helper phage DNA are encapsidated and 
the progeny particles carry a mixture of the wild-type 
protein and the fusion protein (figure 12-6B). It is often desirable 
to control expression of the fusion protein so that at 
most one copy will be present in any particle; when this is 
the case the panning procedure will usually lead to the 
isolation of phages with the highest affinity to the target. 
When more than one copy of the fusion is expressed, avidity 
due to a range of multiplicities can lead to the isolation of 
binders of lower affinity. 

Il’ichev et al. (75), to the surprise of most phage biologists, 
found that peptides could be added not only to 
the N-terminus of pIII but also to pVIII, the major coat 
protein, of which there are about 2700 copies in a wild-type 
Ff phage (figure 12-6B). Similar observations were subsequently 
reported by several other laboratories (38, 55, 104). 
When the fusion protein is the only source of phage coat, 
peptides of length 6 seem to be generally tolerated, while 
only a subset of longer peptides leave the coat functional 
(74). If wild-type coat protein is supplied, much longer peptides 
may be presented on the phage surface (55), and indeed 
even FABs are displayed, although at much lower copy 
number than peptides (81) (figure 12-6C). 

Because of the very high copy number of pVIII peptide 
display, phage have been used as immunogens (55, 110, 
181). They can also serve as tools with which to identify relatively 
low affinity peptides whose binding might not be 
strong enough to withstand the washing procedures. 
Sequences that encode weak binders are then often mutagenized 
and reselected, in the expectation, often achieved, 
that higher affinity binders will be found amongst the 
progeny (37, 180). Such methods attempt to recapitulate 
in principle the workings of the immune system, in which 
initial, relatively low affinity binders in immunoglobulins 
that form multimers are followed, after suitable selection, 
by binders of higher affinity carried on monomeric 
species. 

Figure 12-6 Outline of several filamentous phage display strategies. A: pVII and pIX display: interacting proteins fused to their 
N-termini (signal sequences upstream, wild-type pVII and pIX provided). pVIII display: large proteins fused to the N-terminus 
of pVIII displayed at low copy only (wild-type pVIII provided). pVI display: proteins fused to the C-terminus displayed, though 
inefficiently (wild-type pVI provided). pIII display: short peptides at the N-terminus of pIII can be uniformly displayed without 
disrupting infectivity. B: pVIII display: short peptides at the N-terminus of pVIII can be uniformly displayed without disrupting 
virus assembly; pIII display: large proteins fused to the N-terminus of pIII (wild-type pIII provided). C: Fos-Jun C-terminal 
display: a Fos-protein X_N fusion associates with N-terminally displayed Jun-pIII in the periplasm, and the entire sandwich 
incorporates into virions that contain the X_N DNA sequence. Selectively infective phage (SIP): protein X_a fused to the 
N-terminus of the C-domain of pIII and protein Y_a fused to the C-terminus of the N1-N2 domains of pIII; X_a-Y_a interaction 
makes particles infectious. See chapter 44 for additional discussion of phage-mediated protein display. 

There are constraints on display at the N-terminus of 
either pIII or pVIII. The fusions must be compatible with 
export through the host’s inner membrane. The insertion 
in pIII or pVIII must be between a signal sequence and 
the mature part of the protein, and the inserts must be in 
frame with both the signal sequence and the mature protein 
and should not contain stop codons. When the sequence 
encoding the displayed protein is a synthetic construct, the 
frame and stop codon limitations may not be a problem, but 
if the inserted sequence is random, or comes from random 
DNA fragments, the phase and stop codon limitations are 
manifest, since for a random fragment only one eighteenth 
can be expected to be oriented correctly and in 
frame with both the signal sequence and the downstream 
phage gene. 
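
The one-eighteenth figure for random fragments can be reproduced with 
a short calculation. The sketch below assumes that the two 
orientations and the three reading frames at each junction are 
equally likely and independent; the optional stop-codon term further 
assumes uniformly random codons (3 of the 64 codons in the standard 
genetic code are stops), which real DNA only approximates. 

```python
from fractions import Fraction


def fraction_displayable(n_codons: int = 0) -> Fraction:
    """Expected fraction of random inserts that can be displayed.

    Assumptions (idealized): orientation and reading frame are uniformly
    random, and codons within the insert are uniformly random.
    """
    orientation = Fraction(1, 2)       # correct strand
    upstream_frame = Fraction(1, 3)    # in frame with the signal sequence
    downstream_frame = Fraction(1, 3)  # in frame with the phage gene
    # Probability that none of the n_codons codons is a stop codon.
    no_stop = Fraction(61, 64) ** n_codons
    return orientation * upstream_frame * downstream_frame * no_stop


if __name__ == "__main__":
    print("orientation and frame only:", fraction_displayable())
    for n in (10, 50, 100):
        print(f"{n} codons, no stop: {float(fraction_displayable(n)):.4f}")
```

With the stop-codon term left out the function returns 1/18; 
including it shows how quickly the usable fraction falls as random 
inserts grow longer. 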

A number of ways, some very ingenious, have been found 
that avoid these problems. In order to make fusions with 
a free C-terminus, Crameri and Suter (22) used Fos and Jun, 
which bind tightly, as an adapter pair (figure 12-6C). They 
constructed a vector that expresses a Jun-pIII fusion and 
carries a cloning site arranged so that polypeptides encoded 
by cloned DNA are fused to the C-terminus of Fos. Both Fos 
and Jun were engineered to contain unpaired cysteine residues, 
which form disulfides and prevent exchange between 
different Jun-Fos pairs. This permitted display of proteins 
from cDNA libraries, which often contain stop codons that, 
if inserted into gene III, would prevent expression of its 
C-terminal region. 

Jespers et al. (78) found that fusions to the C-terminus 
of pVI were accessible, although the presentation was 
much less efficient (circa 10^-3) than fusion of the same 
epitope to the N-terminus of pIII (figure 12-6C). More 
recently, Sidhu and coworkers have successfully cloned on 
the C-terminus of pIII (43). There is some evidence that 
in wild-type phage the C-terminus of pVI is not available to 
antiserum (36); it is also clear from proteolytic and other 
studies that at least most of the C-terminus of pIII is 
embedded in the phage particle (2, 113, 122). Thus these 
surprising results probably reflect much more structural 
plasticity in the phage than had previously been suspected, 
rather than indicating where the actual C-termini of these 
proteins are in the normal virion. 

The plasticity hypothesis (or perhaps degeneracy) is supported 
by work that showed that the requirements for the 
incorporation of a pVIII fusion protein are quite limited 
and that very extensive mutagenesis is compatible with 
display and in some instances can much improve it (146, 
173). These observations led to the construction of a 
“minimal” coat protein that retains only nine non-Ala residues 
and yet is incorporated into phage (129). A “reversed” 
major coat protein was also constructed, in which the 
pattern of residues found to be necessary from the mutagenesis 
studies was inverted in the N-to-C-terminal direction 
(172). Both the “minimal” and “reversed” proteins were 
selected to display a ligand. Neither the “minimal” nor the 
“reversed” protein is, by itself, sufficient to support phage 
assembly; they require a source of wild-type (or close to 
wild-type) coat protein. Hence, as had been previously 
noted in a different context (55), the requirements for 
incorporation of some coat monomers into phage are 
simpler than those governing complete assembly. 

Gao et al. (48) have demonstrated that by adding a signal 
sequence and linkers to pVII and pIX, the two very small 
proteins at one end of the phage, proteins can be fused to 
their N-termini and displayed on the phage surface 
(figure 12-6A). When a V_L sequence was fused to gene IX, 
and a V_H to gene VII, the phages carrying these fusion 
proteins bound to immobilized antigen and also had catalytic 
antibody activity (48). In control experiments, phage 
carrying only one of the fusions did not bind the antigen. 
These experiments establish the possibility of displaying 
heterodimeric arrays such as FABs, confirm the location 
and orientation of pIX, and establish the localization and 
orientation of pVII in the virion. 

Display has also been adapted to the identification of pairs 
of interacting proteins or peptides, even though they may 
both be unknown. The underlying principle is that the 
N- and C-terminal parts of pIII are expressed from different 
vectors or different sites in one vector. Each half of pIII is 
fused to an interaction candidate. Only if the two parts are 
brought together by interaction between the interaction 
candidates will the resulting phage be infective (29) 
(reviewed by Spada et al. 151). 

Phage display has been used in varied ways, too numerous 
to cite in any detail. The two most frequent uses have 
been to find peptides that bind to particular ligands, and to 
search for or generate antibodies with desirable characteristics. 
Phages decorated with peptides have been used as specific 
immunogens. Antibodies of very high affinity have been 
generated from either immunized or naive libraries. Linking 
the phage to a support and requiring cleavage of a peptide 
for its release has defined protease substrates. Hormones of 
higher or altered affinity have been selected. Catalytic antibodies 
have been developed. Recently, displayed peptides 
have been used for “in vivo” phage display, in which a library 
of peptide fusions is injected into a mammal, and phage 
that are found in particular organs or blood vessels are 
isolated, in the expectation that they provide peptides that 
can “home in” on their target (118, 131). 

There are more than 100 reviews that summarize display 
work, including chapter 44 in this book. We cite some 
relatively recent ones (16, 21, 42, 57, 65, 68, 85, 105, 117, 120, 
144, 166, 177), but there are many others. The book by 
Barbas et al. (4) provides help with the methodology, as do 
a number of papers in Methods in Enzymology (91, 114, 125, 
128, 145, 150, 160, 165, 176, 179). 

There are six extremely similar phage isolates, from 
different parts of the world, that all infect Gram-negative 
bacteria harboring a conjugative plasmid. One of 
these viruses is PRD1. Viruses with this morphology (protein 
capsid surrounding a membrane vesicle containing 
the linear double-stranded DNA genome) also infect 
Gram-positive hosts. These viruses have been classified together 
forming the Tectiviridae family (2). Two characteristics 
have drawn considerable interest to these 
viruses. Firstly, the linear DNA genome has inverted terminal 
repeats, a covalently bound terminal protein at the 
5' ends, and replicates by a protein-primed, sliding-back 
mechanism. Secondly, due to their membranes, these 
viruses are considered to be relatively simple model systems 
for the study of membrane biogenesis, structure, and assembly 
in a well-characterized Escherichia coli background. 

As knowledge of the PRD1 structure accumulates, a third 
line of interest is building up. It appears that the replication 
mechanism, capsid architecture, major coat protein fold, 
and vertex structure strongly resemble those of human 
adenovirus (9, 11, 13, 19, 55). This has led to a proposal 
that these two viruses belong to the same lineage originating 
from a common ancestor that precedes the division of the 
bacterial and eukaryotic domains of life (12). This would 
mean that viruses are old, maybe older than the divergence 
of cellular life into three domains. It also appears that 
there are other viruses, infecting Gram-positive bacteria, 
lower eukaryotes, and maybe archaea, that are candidates 
for this adeno-PRD1 virus lineage. Using this structure-based 
comparison it has been possible to propose additional 
viral lineages that have members infecting hosts from different 
domains of life (4, 12). It should be noted that sequence 
comparison does not allow the identification of these functional 
and structural similarities due to the long evolutionary 
time span involved. 

Previous literature includes three comprehensive 
reviews on PRD1 and related tectiviruses (5, 14, 47). These 
summarize the history and progress made, as well as 
the basic characteristics of these viruses. Therefore we 
only repeat valid information from these reviews when 
it is necessary, and concentrate on the progress made 
since 1994. 

Genome 

Recently the genome sequence of PRD1 was verified by 
automated polymerase chain reaction (PCR)-based cyclic 
DNA sequencing (GenBank Accession No. AY848689). 
Compared with the previously published sequence, which 
was determined at 37°C from plasmid templates (10), a few 
differences were observed. Due to two single-nucleotide 
insertions and one deletion, the sequence of gene XI is 
altered making the corresponding protein longer than 
previously published. The rest of the observed differences 
were single base changes that had no effect on the amino 
acid sequence. The length of the refined genome is 14,927 bp. 

The genes and open reading frames (ORFs) encoded 
by the genome are presented in figure 13-1 and table 13-1. 
The nomenclature follows that given previously (10, 45): if 
an ORF has been shown to encode a protein it is considered 
to be a gene and it has been given a Roman numeral. 
The original ORF classification has been included in 
table 13-1 in order to make it easier to follow the previously 
published literature. The genes are organized into five operons. 
The early operons OE1 and OE2, which are transcribed 
inward from the linear genome ends, express the genome 
terminal protein, DNA polymerase, and two single-stranded 
DNA-binding proteins involved in replication. With a few 
exceptions, the genes for the structural components of the 
virion and the factors involved in assembly are organized 
according to their function into three late operons. Operon 
OL1 contains the genes for the spike-complex proteins, OL2 
carries the genes for the nonstructural assembly factors, 
and OL3 encodes the viral capsid 
and membrane proteins (figure 13-1, table 13-1). 

Figure 13-1 Phage PRD1 genome. A: Functional organization of the PRD1 genome including the promoters (P) and potential 
transcription terminators (open lollipop, forward direction; filled lollipop, reverse direction). The genome is organized into 
five operons. The early operons (OE1 and OE2) are transcribed inward from the genome ends. The three late operons 
(OL1-OL3) in the middle of the genome are transcribed in the same direction as OE1. B: The genes are identified 
by Roman numerals and ORFs by lower case letters according to (10). See thebacteriophages.org/frames_0130.htm for 
a version of this figure. 

Table 13-1 PRD1 Genes, Open Reading Frames (ORF), and Corresponding Proteins 

Gene(a)  ORF(b)    Coordinates(b) Protein  Mass (kDa)(c) Description(d)
VIII               233..1012      P8       29.5          Genome terminal protein (N)
I                  1016..2677     P1       63.3          DNA polymerase (N)
         (ORF a)   2415..2495              3.1
XV                 2679..3128     P15      17.3          Muramidase (L)
II                 3128..4903     P2       63.7          Receptor binding (S)
         (ORF b)   3453..3587              5.1
XXXI     (ORF c)   4907..5287     P31      13.7          Pentameric base of spike (S)
         (ORF d)   5103..5294              7.2
V                  5287..6309     P5       34.2          Trimeric spike protein (S)
XVII               6328..6588     P17      9.5           Assembly (A, N)
XXXIII   (ORF f)   6578..6784     P33      7.5           Assembly (A, N)
VI                 6784..7284     P6       17.6          Minor capsid protein, DNA packaging (C, P)
X                  7029..7640     P10      20.6          Assembly (A, N)
IX                 7637..8320     P9       25.8          Minor capsid protein, DNA packaging ATPase (C, P)
         (ORF i)   8332..8460              4.5
XX       (ORF j)   8460..8588     P20      4.7           DNA packaging (M, P)
III                8595..9782     P3       43.1          Major capsid protein (C)
         (ORF h)   9427..9681              9.2
XXII     (ORF k)   9801..9944     P22      5.5           DNA packaging (M, P)
         (ORF l)   10044..10166            4.4
XVIII    (ORF m)   10168..10440   P18      9.8           DNA delivery (M)
XXXII    (ORF n)   10440..10604   P32      5.4           DNA delivery (M)
XXXIV    (ORF o)   10617..10823   P34      6.7           (M)
XXX      (ORF p)   10833..11087   P30      9.0           Minor capsid protein (C)
         (ORF q)   11090..11200            4.2
XI                 11202..11825   P11      22.2          DNA delivery (M)
XVI      (ORF s)   11836..12189   P16      12.6          Infectivity (M)
VII                12190..12987   P7       27.1          DNA delivery, transglycosylase (L, M)
XIV                12535..12987   P14      15.0          DNA delivery (M)
XXXV     (ORF t)   12984..13337   P35      12.8          Holin (L)
         (ORF u)   13390..13692            10.6
         (ORF v)   13616..13888            10.2
XIX                14132..13848   P19      10.5          ssDNA binding protein (N)
XII                14687..14205   P12      16.6          ssDNA binding protein (N)

(a) ORFs shown to code for functional proteins are classified as genes and are given Roman numerals. 
(b) GenBank Accession No. AY848689. 
(c) The mass does not include the initial methionine if not present in the mature protein. 
(d) N, nonstructural early protein; M, integral membrane protein based on transmembrane helix prediction and location in the viral membrane; S, spike complex 
protein; A, assembly protein; P, packaging protein; C, capsid protein; L, lysis protein. 
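
The coordinates in table 13-1 can be checked against the listed 
masses with a few lines of code. The sketch below is only a rough 
consistency check: the helper names are hypothetical, the rows are a 
small sample copied from the table, reversed coordinates are taken to 
mean a gene on the complementary strand, and the figure of 0.11 kDa 
per residue is a common rule of thumb rather than a value from this 
chapter. 

```python
# Sample rows copied from table 13-1: gene -> (coordinates, listed mass in kDa).
TABLE_13_1_SAMPLE = {
    "VIII": ("233..1012", 29.5),
    "III": ("8595..9782", 43.1),
    "XIX": ("14132..13848", 10.5),  # reversed coordinates
    "XII": ("14687..14205", 16.6),  # reversed coordinates
}

AVERAGE_RESIDUE_KDA = 0.11  # rule-of-thumb assumption, not from the source


def orf_length(coordinates: str) -> int:
    """Length in nucleotides of a 'start..end' interval (either strand)."""
    start, end = (int(part) for part in coordinates.split(".."))
    return abs(end - start) + 1


def codon_count(coordinates: str) -> int:
    """Number of codons, including the stop codon."""
    length = orf_length(coordinates)
    if length % 3 != 0:
        raise ValueError(f"{coordinates}: length {length} is not a multiple of 3")
    return length // 3


if __name__ == "__main__":
    for gene, (coordinates, listed_kda) in TABLE_13_1_SAMPLE.items():
        residues = codon_count(coordinates) - 1  # drop the stop codon
        estimate = residues * AVERAGE_RESIDUE_KDA
        print(f"{gene}: {residues} residues, roughly {estimate:.1f} kDa "
              f"(table lists {listed_kda} kDa)")
```

A length that is not a multiple of three raises an error, which makes 
the same helper useful for spotting transcription mistakes in the 
coordinates. 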

Structure 

Early studies on the structure of PRD1 focused on elucidating 
its architecture by analyzing the virion by dissociation, 
the effects of mutations, and the characterization of 
recombinant proteins (reviewed in 5). These approaches 
continue to be very useful (9, 17, 19, 27-29, 34, 51, 53, 54, 64) 
and have now been complemented by X-ray crystallography 
(7, 11, 12, 69, 70), Raman spectroscopy (3, 65, 66), mass spectrometry 
(33), small-angle X-ray scattering (63), antibody 
labeling (26), and cryo-electron microscopy and image 
reconstruction (13, 55, 57, 58). 

Overall Organization 

The PRD1 virion consists of a linear double-stranded 
DNA genome surrounded by an envelope containing host 
lipids and viral proteins. The major coat protein forms 
a protective shell over this membrane punctuated only by 
minor vertex and cementing proteins. Cryo-electron microscopy 
and image reconstruction of the virion (13, 57) 
revealed an icosahedral particle with average dimensions 
vertex to vertex of 69.8 nm, edge to edge of 65.5 nm, and 
facet to facet of 63.7 nm (figure 13-2A, B, C). There are 
720 surface projections clustered into 240 capsomers occupying 
the hexavalent positions of a pseudo T = 25 lattice 
(figure 13-2A, B, G). The capsomers are trimers of the major 
coat protein, P3 (11, 13). Each facet of the capsid is formed 
from 12 copies of the P3 trimer. 
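
The capsomer counts quoted above follow from standard icosahedral 
lattice arithmetic, which the sketch below reproduces. It uses the 
Caspar-Klug relations (T = h^2 + hk + k^2, and 10T + 2 capsomer 
positions of which 12 are pentavalent); the choice h = 5, k = 0 for 
T = 25 and the function names are illustrative assumptions. 

```python
ICOSAHEDRON_FACETS = 20
ICOSAHEDRON_VERTICES = 12


def triangulation_number(h: int, k: int) -> int:
    """Caspar-Klug triangulation number for lattice indices h and k."""
    return h * h + h * k + k * k


def capsomer_positions(t: int) -> tuple[int, int]:
    """Return (pentavalent, hexavalent) capsomer positions for a given T."""
    total = 10 * t + 2
    pentavalent = ICOSAHEDRON_VERTICES
    return pentavalent, total - pentavalent


if __name__ == "__main__":
    t = triangulation_number(5, 0)
    pentavalent, hexavalent = capsomer_positions(t)
    print("T =", t)
    print("hexavalent capsomers (P3 trimers):", hexavalent)
    print("surface projections (3 per trimer):", hexavalent * 3)
    print("P3 trimers per facet:", hexavalent // ICOSAHEDRON_FACETS)
```

For T = 25 this gives 240 hexavalent positions, 720 projections when 
each trimer contributes three, and 12 trimers per facet, matching 
the values in the text. 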

The PRD1 vertices are occupied by flexible spikes in 
the virion, seen clearly only in electron micrographs 
(figure 13-2C), being averaged out in the icosahedral reconstructions 
because of their flexibility and nonstoichiometric 
association with the capsid (figure 13-2A, B, D). However, 
the base of the spike is clearly resolved as part of a star-shaped 
structure occupying the vertex (figure 13-2D). P31⁻ 
mutants lack this structure along with the peripentonal 
trimers (55). Purified, recombinant P31 is a pentamer 
(33, 55) that has been modeled at low resolution from 
small-angle X-ray scattering data (63). When 5-fold symmetry 
was applied, a structure very similar in shape and dimensions 
to the vertex base was revealed (compare figure 13-2D 
and E). The P31⁻ mutant also fails to assemble P5 and P2, 
but the exact spatial arrangement of the spike-complex 
proteins is not known (55). P5, isolated as a trimer from 
the virion (8, 19), has at least two separate domains. 
The C-terminal domain is the trimerization domain; the 
N-terminal domain, separated from the C-terminus by a 
glycine-rich stretch, is the site of attachment to the virion. 
The N-terminus is homologous to P31, and can disrupt P31 
pentamers in vitro (9, 19). So it is probable that P31 forms 
the pentameric base of the spike (55) and interacts with 
the P5 N-terminus. P2, dependent on the P5 C-terminus for 
assembly (9, 34), is the receptor binding protein (27, 46). Both 
recombinant P5 and P2 are elongated molecules (P2 is 15.5 
nm long, P5 is 27 nm long; 63, 69) and are thus good candidates 
for forming the shaft of the spike. The X-ray structure 
of P2 (69, 70) revealed a club-shaped molecule with a pseudo 
β-propeller head (figure 13-2F, red domain) and a long 
tail formed from an extended β-sheet (figure 13-2F, yellow 
domain). The head is proposed to be the site of receptor 
binding, lying distal to the virus. 

The Major Coat Protein 

In 1999 the major coat protein structure was resolved (11), 
and was recently refined to 0.165 nm resolution (12). It 
revealed a hexagonally based trimer with three interlocking 
subunits, each with two eight-stranded β-barrels normal to 
the viral capsid (figure 13-2H, I). The β-barrels have the 
same topology but no apparent sequence homology. Loops 
above the first β-barrel (V1) are more extensive, rising above 
those of the second barrel (V2). This is defined as a tower and 
corresponds to the 720 surface protrusions seen in the electron 
microscopic reconstructions (13; figure 13-2A, B, G, I). 
The trimer is stabilized by several interactions between the 
V1 of one subunit and the V2 of the adjacent one. The FG2 
helix lies in the subunit interface (figure 13-2H) providing 
extensive hydrogen bonding for V1, V2, and the FG1 loop. In 
addition, the intertwining of the FG1 loops from adjacent 
monomers stabilizes the center of the trimer. A 
pseudo-atomic model of the capsid was built from the 
individual trimers by fitting them into density from 
electron microscopic reconstructions (11, 57). This demonstrated 
their neat interdigitation, which forms a closed shell 
over the surface of the membrane (figure 13-2G). It also 
implicated the N- and C-termini and the loop I1B2 in virion 
assembly as they all face the viral membrane (figure 13-2I). 
This has been supported by mutational analysis of key residues 
in the N-terminus which are thought to form an 
extended helix that reaches the membrane surface 
and stabilizes the virion during DNA packaging (13, 57, 58). 
The last few C-terminal residues also have an effect on 
assembly, supporting the hypothesis that these residues 
from adjacent trimers link together to help stabilize 
the center of the facets (58). 

The location of additional capsid cementing proteins 
(glue proteins) was hypothesized by studying the 
energy of interaction between trimers in the capsid (57). 
Mutational studies and difference imaging have identified 
one potential glue protein, P30, probably located between 
facets, and another, as yet unidentified protein, 
that radiates out from the inside of the capsid vertex, 
between the bases of the peripentonal trimers (54, 57, 58). 
Figure 13-2 Structure of PRD1. Cryo-electron microscopy and image reconstruction of PRD1 at 1.4 nm resolution reveal
the following: A: In central cross-section the organization of the virion into three layers: the outer protein capsid, the
membrane bilayer, and the underlying concentric DNA rings. B: The surface representation of the icosahedral virion
reconstruction viewed down a 3-fold axis of symmetry shows the pseudo T = 25 arrangement of the surface lattice. C: Spikes
on the virion vertices that are sometimes evident in cryo-electron micrographs of the virion (arrow in C) are averaged out
in the reconstruction, leaving only a small central bump as seen in the section (A) and on the surface representation (B),
enlarged in (D). D: The vertex base is made of a P31 pentamer which is similar in shape and dimensions
when E: modeled from SAXS data of the recombinant protein (63). F: A ribbon diagram of the atomic model of the
monomeric receptor binding protein, P2 (69). The pseudo β-propeller head is the top domain, the extended β-sheet tail
is the bottom domain. G: The atomic model of the trimeric major coat protein, P3, fits into four quasi-equivalent
positions in the electron microscopic density (wire mesh from a 1.4 nm PRD1 P9⁻ mutant reconstruction). The P3 Cα chains
are shown fitted into the wire mesh (57). The density of P3 is also outlined in white in the left half of the section in (A).
H: A ribbon diagram of the P3 trimer as seen from the outside of the virion. One monomer is highlighted to show the two
adjacent β-barrels. The FG1 loops lock adjacent monomers together, the FG2 helix lies in the subunit interface. I: A side
view of the P3 trimer rotated 90° relative to (H). The larger β-barrel forms all of the 720 towers seen in the
reconstruction (A, B). The I1B2 loop and the N- and C-termini all face the viral membrane (12). Scale bar in (A) 50 nm,
in (C) 100 nm, in (G) 5 nm. In (E) the pseudoatoms are 0.19 nm in radius. Figure (A) was prepared with SPIDER (25),
(B) with opendx, (E) with rasmol (62), (F) with O (36), (F), (G) and (H) with Molscript and Raster3D (38, 44).

See thebacteriophages.org/frames_0130.htm for a color version of this figure. 
Raman spectroscopy has identified a small increase in
α-helical content in a shell containing
180 trimers of P3 and 60 copies of P30, compared with
trimers of P3 alone. This could be accounted for by the
stabilization of the P3 subunit N- and C-terminal α-helical
domains (54, 57, 58, 65).

Membrane and DNA 

Laser Raman spectroscopy of PRD1 has shown that the lipids
in the membrane are in the liquid crystalline phase between
5 and 50°C, and the membrane proteins are mainly α-helical
(3, 10, 66). From the electron microscopic reconstructions
(figure 13-2A) it is clear that the membrane is well ordered
and follows the icosahedral outline of the capsid, and that the
leaflet centers are separated by approximately 2.9 nm. Many
interactions occur between the membrane and both the capsid
and the underlying DNA (13, 57, 58). The extensive
interactions between the DNA and especially the
phosphatidylethanolamine moieties occur without affecting the B-form
backbone conformation of the DNA (66). The average
separation of the concentric layers of DNA is approximately 2.5 nm,
similar to that found for other bacteriophages and animal
viruses (13, 57).

The structural analysis of PRD1 received a huge boost
recently when diffracting crystals of a P2⁻ mutant of the
virion were obtained (7, 20). This ongoing structural
determination will provide not only the detailed organization
of the capsid but also the first high-resolution description of
a native membrane as part of one of the largest structures
solved to date.


Life Cycle 

PRD1 is a lytic phage that exploits the transcription and 
possibly some of the replication functions of its host. The 
host cell is selected by specific recognition of a receptor on 
the cell surface. After adsorption, the phage genome is 
injected into the cell cytosol leaving the capsid outside. 
After production of the phage components, both virus- and 
host-encoded factors assist in particle assembly. Host cell 
lysis releases some 500 progeny virions. For a schematic 
presentation of the PRD1 life cycle see figure 13-3. 


Receptor Recognition 

PRD1 belongs to the class of broad-host-range, donor-specific
phages, which infect cells only when IncP-, IncN-,
or IncW-type multiple drug resistance conjugative plasmids
are present. Among the hosts are several opportunistic
human pathogens such as E. coli, Salmonella enterica, and
Pseudomonas aeruginosa. Plasmid functions in the phage
life cycle are dispensable after the genome has entered the
cell (23, 43). These plasmids, the best studied of which is
the IncP plasmid RP4 (50), encode the phage receptor on
the cell surface. Receptor saturation experiments revealed
approximately 25 and 60 receptors evenly distributed on
the surfaces of E. coli and S. enterica cells, respectively
(37). The receptor structure is accessible beyond the
lipopolysaccharide layer on the cell surface since the length
of the lipopolysaccharide chain does not affect PRD1
propagation (37).

The genes needed to support PRD1 entry in RP4-containing
cells localize into two plasmid regions, Tra1 and Tra2,
which are responsible for conjugative transfer functions
(39, 68). Selection of spontaneous PRD1-resistant cells
revealed that 10 Tra2 region genes (trbB, -C, -D, -E, -F, -G, -H,
-I, -J, and -L) and one (traF) from the Tra1 region are essential
for PRD1 sensitivity (31). The same set of genes is the
minimal requirement for RP4 self-transfer between E. coli cells
(32). Only 1% of the spontaneous PRD1-resistant mutants
were weakly transfer-proficient, suggesting that the Tra2
gene products interact with each other, forming a
multifunctional complex. The primary function of this apparatus
is to establish an intimate cell-cell contact, the mating
pair formation (Mpf), in bacterial conjugation.

The RP4 Mpf components are proposed to form a channel 
or a pore, facilitating DNA transfer through the membranes 
into the recipient cell (40). Systematic searches of the Mpf 
proteins for secondary structure elements, which are typical 
for membrane proteins, suggest that these proteins are 
targeted to the cell envelope. Cell fractionation studies 
revealed that TrbE, TrbF, TrbI, TrbL, and TraF localize to 
the cytoplasmic membrane and gene products containing 
cleavable signal peptides are exported into the periplasm 
(TrbC, TrbG, and TrbJ) or to the outer membrane (TrbH) 
independently of any other RP4 function (30). The pilin 
protein of RP4 (TrbC) is cyclized by an RP4-encoded serine 
protease, TraF, and assembled into a conjugative pilus on the 
cell surface (24). In the presence of all Mpf components 
a trans-envelope structure bridging the inner and outer 
membranes is formed (30); see figure 13-3. This DNA transfer
complex has been shown to increase cell envelope
permeability, supporting its proposed role as a conductive 
channel (21). 

A single phage structural protein, P2, located at the 
vertices (figure 13-2), is responsible for PRD1 attachment to 
its host (46). P2 is a minor protein and its presence on the 
phage particle is dependent on the spike protein P5 and 
the penton, protein P31 (9, 55). The purified recombinant P2 
is a stable monomer, which binds to its receptor with 
high affinity and this binding prevents phage adsorption 
(27). It is also a stability factor that ensures the DNA injection
process does not start before the virus binds to its receptor
(27).

Each of the PRD1 vertices is a metastable structure (55) 
and possibly capable of DNA release. The injection vertex 
is likely to be determined by P2 binding to the receptor (27). 
Figure 13-3 Schematic representation of the PRD1 life cycle. The vertex-associated spike complex (composed of proteins
P31, P5, and P2; see Fig. 13-2) binds to the IncP plasmid-encoded DNA transfer complex (upper left panel). The sequential
steps of genome delivery are shown in the upper right panel (A-E). A: Receptor binding signals to the DNA delivery apparatus
(containing at least proteins P7, P11, P14, P18, and P32), leading to considerable conformational changes in the vertex
structure. B: The removal of the spike complex creates an opening in the vertex, which enables an appendage to protrude
(protein P11) that penetrates the outer membrane. C: The lytic transglycosylase (P7) is thought to assist in genome entry by
locally degrading the peptidoglycan layer. D: The appendage formed extends some 35 nm (the thickness of the cell
envelope), penetrating the inner membrane. E: The DNA translocation is dependent on active membrane tube formation and
reduction of the membrane vesicle volume assisted by at least proteins P14, P18, and P32. After DNA injection,
protein-primed genome replication, transcription, and translation take place. Upon translation, the major capsid protein P3
accumulates as trimers in the host cytosol whereas those phage proteins associated with the virus membrane in the mature
virion are addressed to the host cell inner membrane. Particle assembly is assisted by the host GroEL/ES complex and three
phage-encoded assembly factors: the membrane-bound P10, and the soluble P17 and P33. Upon particle formation the
phage-specific membrane is pinched off from the host inner membrane by the action of the scaffold protein P10 and the
major coat protein P3. The minor capsid protein P30 enables the formation of an empty prohead by stabilizing the P3
interactions. The phage DNA is packaged into proheads by the packaging ATPase P9 through a unique vertex and mature
virions are released upon lysis of the host cell caused by the phage-encoded muramidase P15 and holin protein P35.

See thebacteriophages.org/frames_0130.htm for a color version of this figure. 


The association of P2 with the receptor activates, possibly
by P2 removal, the injection process. This leads to
irreversible binding. Both empty and DNA-containing particles are
equally tightly bound to cells, indicating that DNA injection
is not a prerequisite for this tight interaction (27).
Irreversible binding is not due to the receptor binding
protein P2 but to an as yet unidentified component (27).
After initial binding, PRD1 seems to be relatively independent
in translocating the genome into the host cell cytosol
(see below) and at present we do not have data about the
further involvement of the IncP plasmid-encoded DNA
transfer complex in the PRD1 entry process.

DNA Entry 

Isolation and analysis of PRD1 mutants have resulted in
the identification of eight phage-specific structural proteins
essential for infectivity (9, 29, 45, 46, 51, 55). In addition
to the spike-complex proteins (P2, P5, and P31) needed
for adsorption, proteins P11, P14, P18, and P32 are involved
in DNA delivery, since particles devoid of these proteins
bind irreversibly to host cells but are noninfectious. Mutant
particles missing protein P7 (a lytic transglycosylase)
are infectious but the DNA entry process is delayed (51).

Efflux of cytosolic potassium has been shown to take
place concomitantly with phage DNA transport (41).
Electrochemical analysis of infected cells showed that PRD1
infection induces a transient efflux of K⁺ and ATP from
the cytosol (22). Wild-type virions also increased the
permeability of the host outer membrane to lipophilic
compounds, while adsorption of empty particles did not cause
any changes in cell envelope permeability. Electrochemical
measurements of cells infected with mutant particles
missing proteins P7, P11, P14, P18, or P32 revealed that all
particles, except those devoid of P11, increased the
permeability of both the outer membrane and the cytoplasmic
membrane, suggesting that P11 is temporally the first
DNA delivery protein (28). Thus all other DNA delivery
mutants are able to cause membrane-associated effects
similar to those of the wild-type particles, but DNA translocation
does not occur.

It has been shown previously that the PRD1 membrane
can undergo a structural transformation from a spherical
vesicle to a tubular form (1, 13, 42). Empty particles with
these tubes, extending from one vertex, occur with a
frequency of about 10% in wild-type preparations (55). This
tube formation has been suggested to be involved in PRD1
DNA injection (1, 42). Interestingly, it was observed that
particles devoid of the receptor binding protein, P2,
spontaneously release their DNA with concomitant formation of
the tail-like membrane structure (27). Tube formation plays
an important role in phage DNA transport, as absence of
any of the integral viral membrane proteins P14, P18, or
P32 abolishes tube formation (28, 29). Since the phenotype
of the P14⁻, P18⁻, or P32⁻ particles is similar and
comparable to that of wild-type virions, except for the compromised
tube formation, these proteins probably form a viral
membrane-associated DNA translocation machinery.

Genome Replication

The genome of PRD1 is a linear double-stranded DNA
molecule with proteins covalently attached to both 5' termini
and 110 bp terminal repeats (6, 59). PRD1 replicates its
DNA by means of a protein-primed replication mechanism
similarly to other viruses with linear double-stranded
DNA genomes, including adenovirus, phage Cp-1, and the
φ29-type phages (reviewed in 56). Development of a
minimal in vitro replication system using purified components
has enabled detailed characterization of the replication
mechanism (61). PRD1 DNA replication starts with the
formation of a covalent bond between the primer protein,
P8, and the 5' terminal nucleotide, dGMP, in a reaction
catalyzed by the phage DNA polymerase, P1 (6, 60). Similarly
to other DNA polymerases, P1 is activated by Mg²⁺, but also
by Mn²⁺, which significantly stimulates the initiation
reaction (16). The minimal origin of replication resides in the
first 20 terminal base pairs of both genome ends and the
fourth base from the 3' end of the template directs, by base
complementation, the dNMP to be linked to the terminal
protein (15, 71). The 3' end DNA sequence is maintained by
a sliding-back mechanism where the polymerase complex
dissociates from the template, moves back, and binds again
in a stepwise fashion.

Subsequent to initiation, elongation of the initiation
complex by the same DNA polymerase takes place, in a
processive manner, resulting in the formation of full-length
daughter DNA molecules (16, 61). The PRD1 DNA
polymerase possesses, in addition to protein-primed initiation
and DNA polymerization activities, a 3'→5' exonuclease
activity specific for single-stranded DNA (16, 61). This
probably accounts for the proofreading capacity. Two
phage-encoded DNA binding proteins, P12 and P19, are involved
in replication in vivo, although they are dispensable in
reactions in vitro (48, 49). Both proteins preferentially
bind to single-stranded DNA, protecting it from nucleases
(48, 49).

Particle Assembly 

Radioactive labeling and immunolabeling have shown
that approximately 15 minutes after infection the major
capsid protein P3 and the spike-complex proteins P2, P5,
and P31 are found soluble in the host cell cytosol, whereas
the phage-encoded membrane proteins (e.g., P7, P11, P14,
and P18) are addressed to the host cell cytoplasmic
membrane. Correct folding of the soluble proteins and the
assembly of a number of viral membrane proteins are
dependent on the host GroEL/ES chaperonins (35). Upon
assembly, a virus-specific patch from the host cytoplasmic
membrane is translocated into the forming prohead using
the membrane-bound scaffolding protein P10 (54). In
addition, two small phage-encoded soluble factors are
implicated in the assembly process: the tetrameric protein P17
(17) and possibly P33 (7).

Recently it has been shown that the capsid of PRD1
contains, in addition to P3, a minor component P30 (about
3% of the total mass of the capsid), which has been classified
as a glue protein in analogy to adenovirus glue proteins
(54). This 9 kDa minor protein plays a key role in the capsid
assembly process, as infection with the P30⁻ mutant results
in the formation of a virus-specific membrane vesicle with
only 5-10% of P3 present. The most abundant protein
on the vesicle is the scaffold protein P10 (54). Thus the vesicle
probably represents an early intermediate in the assembly
pathway.

Correct assembly results in a prohead with a capsid
enclosing the membrane rich in phage-specific proteins.
The linear DNA is packaged into the prohead by the
packaging ATPase P9, which is a structural component of the
mature virion. In addition to P9, two small membrane
proteins, P20 and P22 (46), are essential for the stable
packaging of the phage DNA, indicating that the portal structure,
which resides in a single vertex and contains proteins P6, P9,
P20, and P22, is presumably connected to the viral
membrane (26, 64).

Cell Lysis 

At the end of the infection cycle, the newly synthesized
progeny virions are released via host cell lysis. Two genes,
XV and A, involved in this step have been identified by the
analysis of phage nonsense mutants (45), suggesting that
a two-component, holin-endolysin system (see chapter 10)
also operates in phage PRD1. The product of gene XV, protein
P15, has been shown to be a soluble β-1,4-N-acetylmuramidase
that effectively degrades the peptidoglycan of the
Gram-negative cell, causing host cell lysis (18). Recently, it was
shown that this lytic enzyme is also a phage structural
component associated with the membrane via protein-protein
interactions (53). The PRD1 particle carries another
muramidase, protein P7, which has a lytic transglycosylase
activity assisting in genome entry (51). The presence
of two lytic activities may reflect the broad host range of
PRD1. The Tectiviridae family contains viruses infecting
both Gram-negative and Gram-positive cells. Interestingly,
in zymograms the P15 muramidase was able to degrade
the peptidoglycan isolated from a Gram-positive bacterium,
Micrococcus lysodeikticus, while P7 was active only against
the Gram-negative cell wall (51).

In addition to lytic enzymes, bacteriophages quite often 
encode helper protein factors (holins) that facilitate the 
access of lytic enzymes to the susceptible bond in the cell 
wall, and control the timing of lysis (67). Such a factor has 
also been reported for PRD1 (45). This holin gene was 
recently identified and assigned as gene XXXV (52). 

Conclusions 

The emphasis in PRD1 research is becoming more holistic. 
Active areas are virus-host cell interactions, detailed linkage
of atomic resolution structure to function, and virus
evolution. The possibility of exploring this complex system 
in such detail arises because of the fundamental knowledge 
that has already accrued. 

Note added in proof: The X-ray structure of the virion and
the C-terminal domains of protein P5 are now available
(Abrescia et al., 2004, Nature 432, 68-74; Cockburn et al.,
2004, Nature 432, 122-125; Merckel et al., 2005, Mol. Cell, April
issue).
Bacteriophage PM2 is the only described member of
the Corticoviridae family (1). It was isolated from
seawater collected from a polluted bay off the coast of Chile
(13). The original host was a common marine bacterium,
Pseudoalteromonas espejiana BAL-31 (originally
Pseudomonas BAL-31; 14, 18), which is the source of the DNA
exonuclease BAL-31. Alternatively, Pseudoalteromonas sp. ER72M2,
obtained from the East River, New York, by Leonard Mindich
(26), can be used as a host for PM2.

PM2 is the first bacteriophage in which the presence
of lipids in the virion was firmly demonstrated (7, 15). The
mass of the virion (~4.5 × 10⁷ Da) is distributed among
nucleic acid (14%), lipid (14%), and protein (72%)
constituents (8, 9). The sedimentation coefficient (s₂₀,w) of the
particle is 293 S and the buoyant density in sucrose and cesium
chloride is 1.26 g/cm³ and 1.28 g/cm³, respectively (8, 26).
The stability of the virion is dependent on sodium and
calcium ions and the virion equilibrated in sucrose is
inactive (26). The viral membrane is located internally
(7, 20) and it forms, together with the phage-encoded
membrane-associated proteins and the phage DNA, a lipid
core particle (25). An icosahedrally ordered capsid, about
60 nm in diameter, surrounds the lipid core. Thus the
overall architecture of PM2 resembles that of bacteriophage
PRD1, a tectivirus, which has a membrane vesicle
surrounding the linear double-stranded DNA genome and an outer
icosahedral capsid of approximately 65 nm in diameter
(see chapter 13).
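
The mass composition quoted above can be cross-checked against the genome size. The following is a rough sketch, not a measurement: it uses the 10,079 bp genome length reported later in this chapter and assumes an average of about 650 Da per base pair of double-stranded DNA, which is a textbook approximation and not a value from this chapter. All names are hypothetical.

```python
# Rough consistency check of the PM2 virion mass composition (illustrative).
VIRION_MASS_DA = 4.5e7
MASS_FRACTIONS = {"nucleic acid": 0.14, "lipid": 0.14, "protein": 0.72}
GENOME_LENGTH_BP = 10_079   # genome length reported in this chapter
AVG_DA_PER_BP = 650.0       # assumption: approximate mass of one dsDNA base pair


def component_masses(total_mass, fractions):
    """Split the virion mass into its constituents, in daltons."""
    return {name: total_mass * fraction for name, fraction in fractions.items()}


def genome_mass(length_bp, da_per_bp=AVG_DA_PER_BP):
    """Estimate the genome mass in daltons from its length."""
    return length_bp * da_per_bp


if __name__ == "__main__":
    masses = component_masses(VIRION_MASS_DA, MASS_FRACTIONS)
    for name, mass in masses.items():
        print(f"{name}: {mass:.2e} Da")
    estimate = genome_mass(GENOME_LENGTH_BP)
    print(f"genome mass from sequence length: {estimate:.2e} Da")
    print(f"ratio to nucleic acid fraction: {estimate / masses['nucleic acid']:.2f}")
```

With these assumed values the two estimates of the DNA mass agree to within a few percent, which is consistent with a single genome copy per virion.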

Life Cycle and Virion Components 

PM2 binds to a cell surface receptor. In rich medium the
infected cells lyse 60-70 minutes after infection, releasing
some 300 virions (26). The replication and particle assembly
seem to take place in association with the host plasma
membrane. Virus-size membrane vesicles, considered to be
assembly intermediates, have been depicted lined up along
the host plasma membrane (11). The viral lipids are derived
from the host plasma membrane (15) and the lipid
composition (approximately 64% phosphatidylglycerol, 27%
phosphatidylethanolamine, 8% neutral lipids, and small
amounts of acyl phosphatidylglycerol; 7, 33) deviates from
that of the host bacterium (5). This reflects either a selective
assembly mechanism or a specific location of assembly.

Originally four structural proteins, I-IV, were discovered
in the PM2 virion (12), but a higher number was also
postulated (6). Recently, the development of better purification
and additional disruption methods for the virion (26) as
well as information obtained from the genome sequence
(30) enabled the identification of ten structural proteins.
They were designated P1-P10 (table 14-1), P1-P4
corresponding to the earlier described proteins I-IV. The outer
capsid of PM2 is composed of the major capsid protein P2
and the host cell attachment protein P1, which is located
at the 5-fold vertices (6, 21, 31). These two proteins
correspond to 40-50% of the protein mass of the virion. P1 can
be quantitatively removed from the virion in a monomeric
form by lowering the ionic strength (25). Freezing and
thawing or removal of calcium ions releases both P1 and P2
quantitatively (25). Protein P2 is released as a trimer. It has
been proposed that P2 binds calcium (31), and that calcium
ions are essential in the final assembly process during PM2
infection (32). Proteins P3-P8 have a predicted
transmembrane region and protein P4 has been shown to bind DNA
(28). Figure 14-1 schematically depicts the architecture of
the PM2 virion.

The Genome 

The double-stranded circular genome of PM2 is one of the 
most tightly supercoiled DNA molecules known (16). For 
this reason it has been widely used as a substrate in a 
Table 14-1 PM2 Genes, Open Reading Frames (ORFs), and Corresponding Proteins

Gene/ORFᵃ   Coordinates in    Protein   Mass     Descriptionᵈ
            PM2 Genomeᵇ                 (kDa)ᶜ

XV          550..77ᵉ          P15       18.1     Transcription factor (N)
ORF b       755..663ᵉ                    5.5
XVI         1128..850ᵉ        P16       10.3     Transcription factor (N)
ORF d       1359..1583                   8.5
ORF e       1580..1822                   8.9
XII         1779..3719        P12       73.4     Replication initiation protein (N)
XIII        3716..3910        P13        7.2     Transcription factor (N)
XIV         3907..4212        P14       11.0     Transcription factor (N)
ORF h       4212..4643                  15.7
IX          4615..5271        P9        24.7     ATP binding site
VII         5406..5510        P7         3.6     (M)
II          5523..6332        P2        30.2     Major capsid protein
III         6345..6659        P3        10.8     (M)
IV          6659..6781        P4         4.4     (M)
VIII        6781..7008        P8         7.3     (M)
X           7079..7918        P10       29.0
I           7918..8925        P1        37.5     Spike protein
V           8925..9407        P5        17.9     (M)
VI          9400..9783        P6        14.3     (M)
ORF k       9780..9941                   6.0
ORF l       9922..10077                  5.7

ᵃ ORFs shown to code for functional proteins are classified as genes and are given a Roman numeral.
ᵇ GenBank Accession No. AF155037.
ᶜ The mass does not include the initial methionine if not present in the mature protein.
ᵈ N, nonstructural protein; M, integral membrane protein based on transmembrane helix prediction and location in the viral membrane.
ᵉ The gene/ORF is transcribed in the opposite direction to that of the rest of the genes or ORFs.
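
The coordinates and masses in table 14-1 can be related by a quick calculation. The sketch below is an approximation under stated assumptions: the coordinate ranges are taken to include the stop codon, and an average residue mass of 0.110 kDa is used; neither assumption comes from the table itself, and all names are hypothetical. Entries transcribed in the opposite direction are listed with their start coordinate larger than their end coordinate.

```python
# Gene lengths and rough protein masses from table 14-1 coordinates (illustrative).
AVG_RESIDUE_MASS_KDA = 0.110  # assumption: average amino acid residue mass

# name: (start, end, tabulated mass in kDa)
TABLE_14_1 = {
    "XV": (550, 77, 18.1), "ORF b": (755, 663, 5.5), "XVI": (1128, 850, 10.3),
    "ORF d": (1359, 1583, 8.5), "ORF e": (1580, 1822, 8.9),
    "XII": (1779, 3719, 73.4), "XIII": (3716, 3910, 7.2),
    "XIV": (3907, 4212, 11.0), "ORF h": (4212, 4643, 15.7),
    "IX": (4615, 5271, 24.7), "VII": (5406, 5510, 3.6),
    "II": (5523, 6332, 30.2), "III": (6345, 6659, 10.8),
    "IV": (6659, 6781, 4.4), "VIII": (6781, 7008, 7.3),
    "X": (7079, 7918, 29.0), "I": (7918, 8925, 37.5),
    "V": (8925, 9407, 17.9), "VI": (9400, 9783, 14.3),
    "ORF k": (9780, 9941, 6.0), "ORF l": (9922, 10077, 5.7),
}


def gene_length(start, end):
    """Length in nucleotides of an inclusive coordinate range, either strand."""
    return abs(end - start) + 1


def predicted_mass_kda(start, end):
    """Rough protein mass, assuming the range includes one stop codon."""
    residues = gene_length(start, end) // 3 - 1
    return residues * AVG_RESIDUE_MASS_KDA


if __name__ == "__main__":
    for name, (start, end, tabulated) in TABLE_14_1.items():
        length = gene_length(start, end)
        estimate = predicted_mass_kda(start, end)
        print(f"{name:6s} {length:5d} nt  ~{estimate:5.1f} kDa  (table: {tabulated})")
```

Every range in the table is a whole number of codons, and the estimates fall close to the tabulated masses, for example about 29.6 kDa against 30.2 kDa for the major capsid protein P2.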



Figure 14-1 A schematic presentation of the PM2 virion
architecture. [Dimension labels in the figure: approximately 45 nm and 55 nm.]

variety of enzymatic assays. The nucleotide sequence of
the 10,079 bp long PM2 genome has been determined and
revealed 21 putative genes (30). So far, no mutants are
available for PM2. The nomenclature for the genes and
proteins is as follows: a Roman numeral has been
assigned to a gene when it has been confirmed to encode
a protein; in other cases it is classified as an open
reading frame (ORF) with a lower-case letter. The protein
number (Arabic) follows the gene number with the prefix P
(e.g., gene II encodes protein P2; table 14-1, figure 14-2;
26, 30).

Promoter mapping by primer extension has revealed
putative messenger RNA start sites (29). PM2 genes are
arranged in three operons (one immediate early, one early,
and one late; figure 14-2), which are expressed in a timely
fashion during virus infection. Only two of these promoters,
P1207 and P1193, which guide the transcription of the leftward
immediate early and the rightward early operon,
respectively (figure 14-2), are functional in E. coli (29). However,
the promoter of the late operon, P5321, can be activated by two
phage-encoded transcription factors, P13 and P14 (29).
Interestingly, P14 has sequence similarity to the TFIIS-type
general eukaryotic transcription factors, most closely
resembling those of the archaeal organisms Thermococcus celer (23)
and Sulfolobus acidocaldarius (27).

Protein P12 has also been identified on the basis of
sequence similarity. It has conserved sequence motifs
common to superfamily I replication initiation proteins (30).
This superfamily consists of A proteins of certain
bacteriophages, such as φX174 and G4, and initiation proteins of
cyanobacterial and archaeal plasmids (22). Electron
microscopy of PM2 DNA has revealed replication
intermediates consisting of double-stranded circular molecules with
growing tails no longer than the length of the genome (17).
Based on this it was proposed that the replication of PM2
Figure 14-2 The genome of PM2 in a linear form. The circular genome is cut from a unique EcoRII site and the nucleotide
coordinates are defined in (30). A: Promoters (P) and putative transcription terminators are presented. The genome is
organized into three operons. Promoter P1207 is responsible for the transcription of the immediate early operon, which is
transcribed in the opposite direction to the rest of the operons. The early operon is under the control of P1193. The late
promoter, P5321, promoting the expression of the virion structural components, is activated by two phage-encoded
transcription factors, P13 and P14. B: The genes are designated with Roman numerals and the ORFs with letters. For functions
of the genome-encoded proteins see table 14-1. C: The 1.2-kb-long region similar to the Pseudoalteromonas sp. plasmid
pAS28.


DNA utilizes the rolling circle mechanism (10). The sequence
similarity of protein P12 to the replication initiation
proteins promoting rolling circle replication strongly supports
this idea. The two early operons contain several potential
genes encoding factors involved in DNA replication. This is
supported by the fact that a 1.2 kb fragment from the
predicted PM2 early region is homologous to the
maintenance region of the Pseudoalteromonas plasmid pAS28
(figure 14-2; 24, 30).

PM2 has an overall architecture that resembles that of
PRD1, a tectivirus. However, the detailed capsid
architecture, genome organization, and replication mechanism
deviate from those of PRD1. It appears that PM2 does not belong
to the same lineage of viruses as PRD1 and adenovirus (2-4).
When more structural information has accumulated, it
remains to be seen whether PM2 joins an existing virus
lineage or forms one of its own.
Single-Stranded RNA Phages
NINA TSAREVA


The single-stranded RNA coliphages were discovered
by Tim Loeb and Norton Zinder in 1961 as the result of
a search for phages whose infection cycle depends on E. coli
F-pili, normally used for bacterial conjugation. Loeb and
Zinder plated filtered samples of raw New York City sewage
on E. coli strains and screened for phages that would produce
plaques on male (F+) but not female (F−) bacteria. The first
isolate, named f1, turned out to be a filamentous phage with
a single-stranded DNA genome; the second isolate, named
f2, was an RNA-containing phage. Since f2 made clear
plaques, Loeb and Zinder decided to concentrate their work
on this phage (107).

f2 and close relatives such as MS2 and R17 represented
a superb source of pure messenger RNA that could be
produced in large amounts: up to 10¹³ phage particles per
milliliter are made within a few hours after infection of
bacterial cultures, and the phages can be easily purified.
RNA phages also attracted attention because scientists
were intrigued about how their RNA genomes were
replicated.

Habitat 

Single-stranded RNA coliphages are found wherever E. coli
lives, for example in the intestinal tract of man and other
animals. Studies have shown that sewage samples
worldwide contain from 10² up to 10⁷ RNA phage particles per
milliliter (26). For humans RNA phages are harmless
creatures. Several other Gram-negative bacteria can propagate
their own RNA phages.

Two Major Classes of RNA Phages 

The single-stranded RNA phages form the family Leviviridae.
Based on serological cross-reactions, genetic map, and
RNA size this family is divided into two genera: Levivirus
and Allolevivirus (26, 56). Leviviruses (previously called
group A) have RNAs approximately 3500 nucleotides in
length, and code for four proteins: maturation (or A)
protein, coat protein, lysis protein, and the RNA replicase
(figure 15-1). The open reading frame for the lysis protein
overlaps with those of the coat and replicase proteins. MS2
phage of this genus will be used to discuss translational
control in this chapter.

Phages in the Allolevivirus genus (group B) have a longer
genome of approximately 4200 nucleotides. The difference
comes mostly from a region on the genome encoding the
read-through protein (previously called A1). Read-through
arises when the UGA stop codon at nucleotide 1742 (Qβ)
that terminates the normal coat protein gene is occasionally
misread as a tryptophan codon (UGG). This occurs with a
probability of about 6%. About 15 copies of the read-through
protein are incorporated into the capsid (8); their precise
function is unknown but the protein, together with the
maturation (A) protein, is needed for the virion to be infectious.
These alloleviviruses have no distinct lysis protein. Lysis is carried
out by the maturation protein (37, 101). Phage Qβ has been
the most intensively studied allolevivirus. Its RNA
polymerase can be easily purified, while those of the leviviruses
are much more difficult to work with, and so most of the
work on RNA replication has been done with Qβ.

Each Genus Contains Two Species 

Based on serological properties and nucleotide sequence 
each genus is divided into two species (26, 96). MS2 is a 
species I representative. Other sequenced strains within 
this species are f2, R17, M12, JP501 and fr* (96). Species II 
contains GA, KU1, JP34, and many more unsequenced
strains. Species I has a longer and more elaborate 3' UTR (2)
and a small insertion in the replicase gene.

* Although clearly a species I phage, fr differs in many RNA
structural aspects from the other strains in species I. This
may relate to the fact that it was not isolated from a sewage
plant but from an isolated dung-hill (by Hoffmann-Berling)
(C. Biebricher, personal communication).

Figure 15-1 Genetic maps of four representative single-stranded RNA phages. Phages with a read-through protein are
classified as alloleviviruses. ORF1 in AP205 is thought to encode a lysis function. The map of AP205 shown here is a
correction of that published in (43). The difference is due to an additional C residue recently found at position 460.

The alloleviviruses are also composed of two species. 
Qβ, ST, VK, MX1, and M11 are fully or partly sequenced (6)
and belong to species III. NL95, SP, TW28, and ID2 belong 
to species IV (26). The RNA of this species is longer than 
that of III, mainly due to an insertion in the maturation 
protein gene. 

Virion Structure 

Virions contain 180 copies (90 dimers) of the coat protein
arranged in a T = 3 icosahedral shell that encloses the RNA
(figure 15-2A). A single copy of the maturation protein is
bound to the encapsidated RNA. As mentioned above,
alloleviviruses contain in addition about 15 copies of the
read-through protein. Encapsidated RNA is resistant to
ribonuclease treatment. However, RNA in virions made
with defective or missing maturation protein is sensitive to
RNase. The capsid structures of Qβ, GA, PP7, and MS2 have
been solved by X-ray diffraction (28, 84, 85, 88).
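
The count of 180 coat proteins is what quasi-equivalence (Caspar-Klug) theory predicts for a T = 3 shell: an icosahedral capsid with triangulation number T = h^2 + hk + k^2 holds 60T subunits. The short Python sketch below only illustrates that arithmetic; the function names are illustrative and the theory itself is background knowledge, not something derived in this chapter.

```python
def triangulation_number(h: int, k: int) -> int:
    """Caspar-Klug triangulation number T = h^2 + h*k + k^2."""
    if h < 0 or k < 0 or (h == 0 and k == 0):
        raise ValueError("h and k must be non-negative and not both zero")
    return h * h + h * k + k * k


def subunits_in_shell(t: int) -> int:
    """An icosahedral shell with triangulation number T holds 60*T subunits."""
    return 60 * t


t = triangulation_number(1, 1)      # the (h, k) = (1, 1) lattice gives T = 3
copies = subunits_in_shell(t)       # coat protein copies per virion
print(t, copies, copies // 2)       # prints: 3 180 90 (T, monomers, dimers)
```

The same two functions give 60 subunits for T = 1 and 240 for T = 4, which is why the coat protein copy number is a quick check on the lattice assigned to a capsid.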

Natural and Artificial Infection 

F-pili are the mating organelles of E. coli and other
bacteria. They enable male bacteria to transfer a partial
single-stranded copy of their chromosome to females,
which do not possess pili. F-pili are made from a single
protein polymerized into a long (1-2 µm), ribbon-like
structure that protrudes from the cell. RNA phages do not have
the specialized tail assemblies that are used by many DNA
phages to inject their genomes. These phages instead subvert
the F-pili to maneuver their single-stranded RNA genomes
into the cell. Virions attach to the side of the pilus via their
maturation protein (figure 15-2B). Upon contact with the
pilus, the maturation protein is cleaved into two fragments.
This releases the RNA from the virion, and it becomes
sensitive to RNase degradation. How the RNA then gets inside
the cell is not known. One possibility is that the pilus with
the attached RNA retracts into the cell, dragging the RNA
with it (63, 64).

Apart from this natural route, there are several
artificial procedures to infect the bacterial cell. In one, the
outer membrane and the murein layer of the bacterium
(F+ or F-) are removed by a controlled lysozyme treatment.
The remaining spheroplast, held together by the inner
membrane in isotonic medium, can then be infected with
naked (+) or (-) strand RNA. If (-) strand is used, the
bacterium must express replicase to obtain infection (75).
It is also possible to get naked RNA into intact bacterial
cells by electroporation. Alternatively, intact host cells can
be transformed with a plasmid carrying the complete cDNA
of an RNA phage. Transformants produce phage
spontaneously (58, 83). These procedures become important
when one wants to infect cells with phage mutants prepared
in vitro. Of the artificial methods, infection via
plasmid-borne cDNA is by far the most efficient procedure.

Figure 15-2 Virus representations. A: Schematic representation of a Levivirus. Many of the RNA hairpin loops are believed to
be in contact with the coat protein dimers. The position of the maturation protein in the drawing is arbitrary. The capsid
shell has a thickness of 2 nm. Alloleviviruses contain an additional approximately 15 read-through proteins in their virions.
B: Escherichia coli bacterium with F-pili to which many MS2 phages have attached (arrows). Note that a single phage suffices
for successful infection. Courtesy of A. B. Jacobson.

RNA phages are among the smallest autonomous 
viruses known. Their task upon infecting a cell is very 
straightforward: make their proteins, replicate their RNA, 
assemble progeny phage particles, and leave the cell. 
However, RNA translation and RNA replication turn out to 
be carefully regulated, both by RNA secondary structure 
and by binding of coat, polymerase, and maybe maturation 
protein to the RNA at specific sites. Unraveling the
mechanisms by which these processes are controlled has been a
fascinating puzzle. 

Replication Versus Translation:
Competition for the Same RNA Template

Once inside the cell, phage RNA begins to function as 
a messenger RNA for the synthesis of phage proteins. This 
turns out to be a highly regulated process, for two reasons. 
First, different amounts of each protein are needed. For every 
180 copies of the coat protein, the phage needs only 1 copy 
of the maturation protein to construct virions, and a few 
copies of the lysis protein to leave the cell. Also, since a 
single replicase protein can make multiple copies of RNA, 
far fewer replicase proteins are needed than coat proteins. 
Second, replication and translation of the same RNA
molecule can lead to problems. Replication starts at the 3' end
of the RNA and proceeds toward the 5' end. Translation, on the
other hand, moves in the opposite direction on the RNA. If the
phage had not made the proper arrangements, the polymerase
and the translating ribosome would meet somewhere
on the RNA and sit facing each other forever. Thus a
molecule of phage RNA that begins to be translated must not be
allowed to begin replication, and vice versa.

Access of ribosomes to the start sites of phage genes is 
strongly restricted. In fact, on intact full-length RNA only 
the coat gene is able to bind ribosomes directly. Translation 
of the lysis and replicase genes only begins once the coat 
gene has engaged ribosomes and is being translated. Once
replicase has been made, it assembles with some host
proteins (see below) to form the active polymerase (the
enzyme complex of which replicase is a subunit), which
can begin copying the same RNA molecule that is
being translated. The switch from translation to replication
works as follows. Although polymerase starts transcribing
from the 3' end of the phage RNA, it binds to the RNA at
two internal positions, called the S- and M-sites. One of these,
the S-site, overlaps the start of the coat gene (see figure 15-5).
As a result there is competition between polymerase and 
ribosomes for this site. If the polymerase arrives first, there 
will be no new translation, allowing polymerase to copy the 
RNA with no oncoming traffic. On the other hand, if the 
ribosome binds first to the coat protein gene, polymerase 
will not be able to bind to its template, and therefore the 
ribosome is free to complete its voyage unhindered (92, 98) 
(see below for further details). 

Mechanism of Translational Coupling 
Between Coat, Lysis, and Replicase Genes 

How can the access of ribosomes to the start sites of the 
lysis and replicase genes be made dependent on translation 
of the coat gene? The general answer is: by changing RNA
secondary structures. To understand this it is useful to
consider the forces that drive the binding of ribosomes to
translational start regions on prokaryotic messenger RNAs.
There are at least three contributors to this binding energy:
(i) base complementarity between 16S ribosomal RNA and
the Shine-Dalgarno sequence just upstream of the start
codon on the messenger RNA; (ii) interaction of the
anticodon on the initiator fMet-tRNA with the AUG start codon
on the messenger RNA; and (iii) binding of ribosomal protein S1
to pyrimidine-rich sequences frequently found upstream of
the Shine-Dalgarno sequence (81). If a strong pre-existing
secondary structure in messenger RNA prevents binding 
of one or more of the above three components, there will 
be no ribosome binding and therefore no initiation of protein 
synthesis (24). 

Experiments have shown that if the start codon of the 
coat gene is deleted or mutated, preventing ribosomes from 
translating this gene, neither lysis nor replicase protein is 
synthesized (9, 25, 69). This is because the beginnings of
the lysis and replicase genes lie within RNA secondary
structures that are too stable to allow ribosome binding
(figure 15-3).

Control of Replicase Gene Translation

Given that a ribosome contacts messenger RNA over a
stretch of about 20 nucleotides upstream and 15 nucleotides
downstream of the initiator AUG, it is clear that three regions of
secondary structure around the replicase start site
(figure 15-3) could contribute to impeding the ribosome
from binding to that site. These are, first, the long-distance
interaction MJ; second, the operator hairpin containing
the AUG itself; and finally, stem R32. When base-pairing
at stem MJ is abolished by the introduction of mismatches
or by deleting the sequence 1427-1434, there is a large
increase in replicase synthesis, which is independent of
translation of the coat gene (9, 93). Apparently, the
remaining operator and R32 hairpin structures are together not
strong enough to block the entry of ribosomes. The coupling
then works like this. Every time a ribosome reads the
coat gene it disrupts base-pairing at stem MJ. Once this
happens, other ribosomes can bind to the replicase start site
and initiate translation there, even though this site is
some 340 nt downstream from the position where the
ribosome is translating the coat protein gene. Further support
for this model comes from the finding that introduction of
translational stop codons into the coat protein gene
upstream of nt 1427 inactivates translation of the replicase
gene (the ribosome never gets to stem MJ). However, stop
codons placed beyond nt 1434 allow replicase synthesis to
proceed as efficiently as it does in wild-type RNA.
Furthermore, the more frequently the coat-protein gene is
translated, the more replicase is produced (S. H. E. van den Worm
and J. van Duin, unpublished results). Coat-replicase
coupling also exists in Qβ RNA, but has not been studied in great
detail.
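
The mutational results for coat-replicase coupling in MS2 can be condensed into a small decision rule. The Python sketch below merely restates those observations; the function name, its arguments, and the example stop-codon position 1400 are illustrative assumptions, and positions inside the 1427-1434 segment return None because the text reports no data for them.

```python
from typing import Optional

MJ_FIRST, MJ_LAST = 1427, 1434   # coat-gene segment that pairs in stem MJ (from the text)
WILD_TYPE_COAT_STOP = 1725       # end of the MS2 coat gene (from the text)


def replicase_start_accessible(first_coat_stop: int,
                               mj_intact: bool = True,
                               coat_translated: bool = True) -> Optional[bool]:
    """Qualitative prediction of whether ribosomes can reach the replicase start.

    first_coat_stop: position of the first stop codon met in the coat frame.
    mj_intact:       False if stem MJ has been disrupted by mismatches or deletion.
    coat_translated: False if the coat start codon is deleted or mutated.
    """
    if not mj_intact:
        return True              # without MJ, replicase no longer depends on coat translation
    if not coat_translated:
        return False             # nothing opens MJ
    if first_coat_stop < MJ_FIRST:
        return False             # ribosome terminates before reaching MJ
    if first_coat_stop > MJ_LAST:
        return True              # ribosome passes through MJ and opens it
    return None                  # stop inside the MJ segment: not reported in the text


print(replicase_start_accessible(WILD_TYPE_COAT_STOP))      # wild type
print(replicase_start_accessible(1400))                     # hypothetical early stop codon
print(replicase_start_accessible(1400, mj_intact=False))    # same stop, MJ disrupted
```

The rule is deliberately binary. It does not capture the quantitative observation that replicase yield rises with the frequency of coat gene translation.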

There is a second level of control of translation of the 
replicase gene. When the concentration of coat protein 
becomes sufficiently high in the cell, dimers bind to the 
operator hairpin, precluding further translational starts 
of the replicase gene by excluding ribosome binding. 
This protein-RNA interaction is maintained in the intact 
virion and has been studied exhaustively (29, 89, 90, 102, 
103). It is also believed to stimulate capsid formation by 
serving as a nucleation site for further addition of coat 
dimers. 

Control of Lysis Gene Translation 

Independent access of ribosomes to the lysis gene start 
site (nt 1678) is precluded by the “lysis hairpin” (figure 15-3). 
In the absence of coat gene translation, the lysis gene is 
not expressed unless the lysis hairpin has been destabilized 
by disruption of base pairs (10). Surprisingly, activation of 
the lysis gene is not directly coupled to opening of this
hairpin by ribosome movement over the overlapping coat gene
(as with the replicase gene). Instead, translation starting 
at the lysis gene depends on termination of translation at 
the end of the coat gene (nt 1725). If this UAA stop codon 
(and the UAG stop codon that immediately follows it) 
are mutated so that termination does not take place until 
nucleotide 1749 (UGA), there is no expression of the lysis 
gene. On the other hand, if stop codons in the coat gene 
are introduced by mutagenesis between nt 1678 and 
1725 or even not too far upstream of the lysis start, lysis 
protein is made. In fact, the closer the coat gene stop codon 
is placed to the lysis start codon, the more lysis protein is 
made (10). These results suggest that ribosomes that have 
reached the coat stop codon and finished translation can 
reinitiate at the lysis start codon after randomly drifting
a short distance from their site of termination. From the
ratio in which coat and lysis proteins are synthesized by
phage MS2, it follows that the probability of backing up to
the lysis start and successfully reinitiating is only 5% in the
wild-type situation.

This “scanning” model was further tested by introducing
an additional start codon a short distance downstream 
from the authentic lysis start. Now, ribosomes used the new 
start codon in place of the authentic one, as it was closest 
to the termination triplet and thus formed a barrier that 
prevented ribosomes from reaching the authentic start site. 
If, however, coat gene termination was engineered upstream 
of the lysis start codon, the authentic initiation site was 
again used, because now this one was again closest to the 
termination site (1). Independent support for the scanning 
model comes from in vitro experiments with short messenger
RNAs containing only 5 codons. Here, the ribosome
shuffles back and forth between the start and stop codons,
making multiple copies of a pentapeptide without ever
leaving the template (65). Reinitiation is a commonly used
mechanism to translate the distal reading frames in
polycistronic messenger RNA. It is not known what determines
the efficiency of these restarts. Ribosomes are released from 
the messenger RNA if no reinitiation sites are nearby. 
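
The outcome of the start-codon experiments can be reproduced by a deliberately crude rule: a terminated ribosome reinitiates at whichever start codon lies closest to its termination site. The Python sketch below is only a toy restatement of that rule. The coat stop (nt 1725) and lysis start (nt 1678) are taken from the text, whereas the position of the extra start codon (1690) and of the engineered upstream stop (1670) are hypothetical values chosen for illustration.

```python
from typing import Sequence

LYSIS_START = 1678        # authentic lysis start (from the text)
COAT_STOP = 1725          # wild-type coat gene stop (from the text)
EXTRA_START = 1690        # hypothetical extra start just downstream of the lysis start
UPSTREAM_STOP = 1670      # hypothetical engineered coat stop upstream of the lysis start


def reinitiation_site(stop_position: int, start_codons: Sequence[int]) -> int:
    """Return the start codon nearest to the termination site (toy scanning rule)."""
    if not start_codons:
        raise ValueError("at least one candidate start codon is required")
    return min(start_codons, key=lambda start: abs(start - stop_position))


starts = [LYSIS_START, EXTRA_START]
print(reinitiation_site(COAT_STOP, starts))       # prints: 1690 (extra start is used)
print(reinitiation_site(UPSTREAM_STOP, starts))   # prints: 1678 (authentic start is used)
```

A nearest-site rule says nothing about efficiency. The observed reinitiation frequency of about 5% in wild-type MS2 would have to be added as a separate, distance-dependent probability.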

Control of Maturation Protein Synthesis 

In figure 15-4B the equilibrium secondary structure of the 
5' untranslated region (UTR) of MS2 RNA is shown. A 
strong stem-loop at the 5' end is followed by a “cloverleaf
structure”: three hairpins enclosed by a long-range
interaction, formed here by the Shine-Dalgarno (SD) sequence of
the maturation gene, base-paired to an upstream
complementary sequence (UCS). This long-distance interaction
effectively prevents ribosome binding (30). How then is 
the maturation protein made? The answer is that ribosomes 
can only bind to the maturation start site on RNA chains 
that are in the process of being synthesized and have not 
yet reached the equilibrium folding. 

Figure 15-4 Equilibrium and nonequilibrium structure in the 5' UTR of MS2 RNA responsible for the transient translation of
the A-protein gene. A: Gene location. B: The final (equilibrium) structure, which is translationally inactive. C: The folding
intermediate (kinetic trap), which allows translation. Both structures are phylogenetically conserved in the E. coli leviviruses (95). LDI,
long-distance interaction; SD, Shine-Dalgarno sequence; UCS, upstream complementary sequence. For convenience the
UCS nucleotides are shaded. For simplicity, potential structure involving the start codon and surrounding has not been
drawn.

Consider a growing (+) strand RNA being made on a (-)
strand RNA template. By the time nucleotide 123 has
emerged from the polymerase complex, all nucleotides
needed to build the cloverleaf shown in figure 15-4B are
present. However, translation cannot yet start because stable
ribosome binding needs the start codon and sequences
downstream of it, up to about nucleotide 140-145. It is
known that small RNAs such as tRNA (~75 nt) fold up
correctly within milliseconds (from the denatured state).
Phage RNA polymerases are slow and incorporate
approximately 35 nucleotides per second into a growing chain.
Thus, one expects the cloverleaf structure to be formed
long before the ribosome binding site is made, as addition
of a further 20 nucleotides beyond nt 123 would take at
least about 500 ms. However, we found that the folding
of the 5' untranslated region of MS2 RNA to its final
equilibrium is unusually slow: it takes minutes to re-form
after denaturation (67). Generally speaking, such a delay can
only be caused by a “kinetic trap.” This is an alternative
folding that is stable enough to temporarily prevent the
RNA from reaching the equilibrium structure.
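
The timing argument can be checked with simple arithmetic. The Python sketch below uses only the elongation rate quoted in the text (about 35 nucleotides per second); the function names are illustrative, and the 60 s window is an assumed stand-in for a folding delay on the order of a minute.

```python
POLYMERASE_RATE_NT_PER_S = 35.0   # approximate elongation rate quoted in the text


def synthesis_time_ms(nucleotides: int,
                      rate: float = POLYMERASE_RATE_NT_PER_S) -> float:
    """Time in milliseconds needed to add the given number of nucleotides."""
    return 1000.0 * nucleotides / rate


def nucleotides_added(seconds: float,
                      rate: float = POLYMERASE_RATE_NT_PER_S) -> float:
    """Number of nucleotides polymerized in the given time."""
    return rate * seconds


# From nt 123 (cloverleaf complete) to roughly nt 143 (ribosome site complete)
print(f"{synthesis_time_ms(20):.0f} ms")   # prints: 571 ms

# Chain growth during an assumed 60 s folding delay
print(nucleotides_added(60))               # prints: 2100.0
```

A structure that folded in milliseconds would therefore be in place well before the ribosome binding site emerged, whereas a fold that needs about a minute leaves the site exposed while roughly two thousand further nucleotides are made.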

Mutational analysis has indicated that the kinetic trap 
involves the small hairpin between nucleotides 25 and 43
shown in figure 15-4C (95). Formation of this hairpin 
excludes the inhibitory long-distance base-pairing to the 
SD sequence, thus permitting ribosome binding to the 
nonequilibrium structure. However, eventually the RNA 
will fold up into its equilibrium structure shown in 
figure 15-4B, and translation will be shut down. It is not 
difficult to see what the potential biological purpose of this 
control might be. Because of the kinetic control of maturation
gene translation, this gene is not accessible to ribosomes on
full-length RNA. Consequently, replicase needs only to
compete with ribosomes that bind at the coat start site to
create a ribosome-free template, which is needed for
unimpeded synthesis of (-) strand RNAs. Furthermore, it ensures
that the maturation gene will only be translated once or
a few times per newly synthesized (+) strand phage RNA.
Indeed, only one molecule of maturation protein is needed
per (+) strand RNA molecule. In Qβ RNA, the complement
to the SD sequence of the maturation protein gene lies 
more than 400 nt downstream. As a consequence the start 
is available at least as long as the complement has not yet 
been synthesized. A kinetic trap is therefore not strictly 
required here, but may exist to prolong the accessible state 
of the start site (7). 


Control of Coat Gene Translation 

Figure 15-5 shows the MS2 RNA and Qβ RNA structures
at the start of their respective coat genes. The conspicuous
feature is that, in contrast to the rest of the genome
(figures 15-3, 15-4, and 15-6), there is a rather open structure
formed by several single-stranded regions up- and
downstream of the initiator hairpin. This whole section forms the
entry site for ribosomes, polymerase, and host factor (see
later). In its wild-type form the well-studied initiator hairpin
of MS2 is not strong enough to downregulate translation
of the coat gene: its destabilization does not increase coat
protein yield. Its stabilization, on the other hand, reduces
expression by a factor of 10 for every 1.4 kcal/mol stability
increase (24). That is, the rate of coat gene translation is
proportional to the fraction of RNA in which the initiator
hairpin is in the unfolded form. One important conclusion
from this finding is that ribosomes must wait until thermal
motion spontaneously unfolds the hairpin before they
can bind the start sequence. However, it can be calculated
that this “flash exposure” lasts only about 1 µs, a time much
too short for the 30S subunit to diffuse to and bind the
target (25a).
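
The figure of a 10-fold drop per 1.4 kcal/mol is what a simple two-state folding equilibrium predicts. If the hairpin is either folded or unfolded, the unfolded fraction is 1 / (1 + exp(ΔG/RT)), where ΔG is the free energy needed to open the hairpin. For a stable hairpin this fraction falls 10-fold for every RT ln 10 of added stability. The Python sketch below works this out under stated assumptions: a two-state model, a temperature of 37 °C, and hairpin stabilities of 5 kcal/mol and above chosen purely for illustration.

```python
import math

R_KCAL_PER_MOL_K = 1.987e-3   # gas constant in kcal/(mol*K)
TEMP_K = 310.15               # 37 degrees Celsius, an assumed growth temperature


def fraction_unfolded(dg_unfold_kcal: float, temp_k: float = TEMP_K) -> float:
    """Two-state estimate of the fraction of RNA with the hairpin open.

    dg_unfold_kcal is the free energy cost of opening the hairpin
    (positive for a stable hairpin).
    """
    return 1.0 / (1.0 + math.exp(dg_unfold_kcal / (R_KCAL_PER_MOL_K * temp_k)))


def tenfold_increment(temp_k: float = TEMP_K) -> float:
    """Stability increase (kcal/mol) that lowers the open fraction 10-fold."""
    return R_KCAL_PER_MOL_K * temp_k * math.log(10)


step = tenfold_increment()
print(f"RT ln 10 at 37 C: {step:.2f} kcal/mol")

# Hypothetical hairpin stabilities, each one step more stable than the last
for dg in (5.0, 5.0 + step, 5.0 + 2 * step):
    print(f"dG = {dg:5.2f} kcal/mol -> open fraction {fraction_unfolded(dg):.2e}")
```

The computed increment comes out close to the 1.4 kcal/mol reported for the MS2 initiator hairpin, which is consistent with translation being proportional to the unfolded fraction.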

A solution to this paradox is provided by “standby”
binding. Here, the 30S ribosomal subunit is thought to bind,
probably via protein S1, to the single-stranded regions to
await the thermal unfolding. When this happens it can
quickly snap into place by linear diffusion along the RNA,
where it will be fixed by the SD sequence and by
codon-anticodon interaction of initiator tRNA. Finally, the 50S
subunit joins and the ribosome is ready to go. A noteworthy
feature is that the S1 binding site seems not to be covered
by the 70S ribosome, suggesting that S1 is only involved
in early steps in initiation. The above scheme is thought
to be generally valid for prokaryotic translation. It is
presented here since it was developed first for the MS2 coat
protein gene.

As discussed above, the only negative control on translation
of the coat gene is exerted by the polymerase competing
with ribosomes for their overlapping binding site.

Genome Replication Requires Four Host 
Cell Proteins Plus the Replicase 

Once replicase is made, it can begin to generate new copies
of phage RNA. The (+) strand genome is first copied into a (-)
strand. This RNA is in turn used as a template to produce
more (+) strand RNAs. Although fully complementary, (+)
and (-) strands do not anneal to form a double-stranded
RNA. Such annealing is inhibited by the high degree of
internal secondary structure found in each single strand
(5, 12, 79).

When scientists isolated the RNA polymerase activity
from infected cells, they found to their surprise that it is
a complex made of four different proteins, only one of which
(the replicase or β subunit) is coded by the phage genome
(45a). The three other proteins in the complex are host
proteins: ribosomal protein S1 (α subunit) and the two
translation elongation factors, EF-Tu and EF-Ts (subunits γ
and δ). Qβ polymerase contains EF-Tu, EF-Ts and the
replicase protein in a 1:1:1 ratio. S1 is rather loosely bound
and may have a stoichiometry below 1. This protein is
instrumental in binding messenger RNA sequences to
the ribosome during initiation of translation in bacteria
(81, 91). EF-Tu positions charged tRNAs on the ribosomal
A-site during elongation of growing peptide chains. This
activity involves binding of GTP and its hydrolysis to GDP
during the placement of the tRNA onto the ribosome. EF-Ts
recycles EF-Tu·GDP to EF-Tu·GTP via the intermediate
EF-Tu·EF-Ts (38).

As mentioned above, Qβ polymerase binds to two internal
sites on (+) strand RNA, one of which overlaps the start site
for translation of coat protein. This binding is facilitated by
the S1 protein, which carries out the same function for the
ribosome when it is binding to the coat protein start site. 
Thus the competition between ribosomes and replicase for 
the coat protein start site is in fact mediated by the same 
(cellular) protein! 

Replicase initiates RNA synthesis not at the 3' terminal A,
but at the penultimate nucleotide, a C. (The terminal sequence
is ...CCCA-OH (45a).) Thus the 5' end of the new (-) strand
starts with three G residues. Once synthesis has begun, it
can continue in the absence of S1, EF-Tu, and EF-Ts (but
the complex usually stays intact), since the replicase protein
has the RNA polymerase activity. Minus strands are
completed by copying the three Gs at the 5' end of the (+)
strand, and then the replicase adds on a single, untemplated
A residue, so the 3' end of the (-) strand (CCCA-OH) looks like
the initial 3' end of the (+) strand. The presence of this
terminal A is slightly beneficial for transcription in the
absence of host factor (see below).
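
The end rules just described (skip the 3' terminal A, copy the rest, then add one untemplated A) mean that both strands begin with GGG and end with CCCA. The Python sketch below applies those rules to a short made-up sequence; the 13-nucleotide template is hypothetical and is not a real phage sequence, and the function name is illustrative.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def copy_strand(template: str) -> str:
    """Copy an RNA template (written 5'->3') following the replicase end rules.

    The 3'-terminal A is not copied, the remaining nucleotides are copied
    from the penultimate C toward the 5' end, and one untemplated A is
    added to the 3' end of the product. The product is returned 5'->3'.
    """
    if not template.endswith("CCCA"):
        raise ValueError("template is expected to end in ...CCCA")
    templated_part = template[:-1]                       # drop the 3'-terminal A
    product = "".join(COMPLEMENT[nt] for nt in reversed(templated_part))
    return product + "A"                                 # untemplated 3' A


plus_strand = "GGGAUCACUCCCA"          # hypothetical toy (+) strand
minus_strand = copy_strand(plus_strand)
print(minus_strand)                    # prints: GGGAGUGAUCCCA
print(copy_strand(minus_strand))       # prints: GGGAUCACUCCCA
```

Copying the product a second time regenerates the starting sequence, so the same end rules suffice for both halves of the replication cycle.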

Recently, it was found that Qβ polymerase must dimerize
to be active as an enzyme (49). EF-Ts mutants deficient in
dimerization cannot be infected by single-stranded RNA
phages (37a).

Mechanism of Replication

Simple Templates

We distinguish two kinds of template: simple ones that need
only the core enzyme (subunits β, γ and δ) for replication,
and complex ones that need in addition protein S1 as subunit
and another host factor, HF, the product of the hfq gene
(15, 36, 45). There is only one complex template, the Qβ (+)
strand, but there are many simple ones, nowadays called
RQ RNA because they can be used as template by Qβ
replicase (22). Their size is between 50 and 250 nt. Some of them
have arisen spontaneously in vitro by incubation of
polymerase with high concentrations of NTPs (13). Others have
been isolated from Qβ-infected cells or from cells expressing
the replicase gene. They have been formed by recombination
between fragments of Qβ RNA and host RNAs such as
tRNA, rRNA or even RNA from phage λ (4, 53, 54, 55). Some
do not even carry Qβ sequences anymore (22). Still others
have been designed (106), or are SELEX products selected
for binding to the polymerase core enzyme and
subsequently equipped with a 3' tail ending in CCC-OH (19). They
have no biological function and can be considered as selfish
RNA. Surprisingly, Qβ (-) strand is also a simple template.
All RQ RNAs, including Qβ (-) strand, have in common
that they begin with GGG, end with the unpaired CCCA-OH
sequence and contain a binding site for Qβ polymerase core
enzyme. The characteristics of the binding site are not very
specific. All that seems to be needed is a stretch of weakly
structured RNA, preferentially pyrimidine-rich (18-20).
Even a simple 9-fold CCCA repeat forms a good template
(86) and polyC has historically been used as matrix to assay
column fractions for polymerase activity.

Which of the three subunits is responsible for binding the
template? In a binary complex of core enzyme and template
the RNA can be UV-crosslinked to the EF-Tu subunit. (In an
elongating complex the crosslink shifted to the replicase
subunit, in agreement with its polymerizing properties
(20).) This result suggests that EF-Tu is the subunit that
binds the RNA to the polymerase. Consistent with such
a finding is that Qβ polymerases carrying mutant EF-Tu
subunits that are progressively impaired in tRNA binding
capacity are also progressively impaired in replication (48).
It remains a mystery what the common denominator is
in all these RQ RNAs and tRNA, the natural substrate of
EF-Tu. tRNA certainly does not have weakly structured
stretches. Maybe one should consider that a “weakly
structured” region in an RQ RNA is flexible enough to fill (part
of) the tRNA binding site of EF-Tu, which mostly involves
contacts with the phosphate-sugar backbone anyway (57).
Things are further complicated by the results of a
fluorimetric assay which confirmed the strong affinity of
polymerase for polypyrimidines but showed in addition binding
to double-stranded RNA, including some tRNAs, with
almost equal Kd values. Whether or not these templates
were bound via EF-Tu or via another subunit was not tested
(68). Also, one should realize that strong binding does not
necessarily correlate with efficient replication (14).

The simplest picture for replication is that the template
is anchored on the core enzyme by EF-Tu, whereafter the β
subunit captures the CCCA-OH sequence (47). On short RNAs
catching the terminus may be nearly instantaneous, but
on longer RNA such as Qβ (-) strand such a process could
take a long time and one expects the search to be guided
by RNA folding, which should position the RNA terminus
near the enzyme's active site. Indeed, a deletion analysis has
shown that a long-distance interaction in the 3' terminal
region of Qβ (-) strand is required for transcription (70).

When the C-terminal 24 amino acids of replicase are
deleted (up to Ala 565), the enzyme shows a decreased
template specificity, suggesting that initial contacts are not
restricted to EF-Tu but may also involve the β subunit (34).

Complex Templates 

Role of S1

Qβ (+) strand is the only template that requires S1 and HF
for its copying (52, 99). Both proteins stimulate transcription
about 10-fold. Several reasons for the special status of (+)
strand come to mind. First, (+) strand is the only template
serving also as messenger RNA and the dependence on
S1 enables the necessary competition between ribosome
and polymerase (see above). Second, the (+) strand is the
genome and the infectious agent: a single molecule must be
able to survive in the bacterial cytoplasm at least for as long
as it takes to translate the replicase gene and produce the
first (-) strand. The danger comes from the messenger RNA
degrading machinery with RNase E as endonuclease and
several 3' exonucleases that turn over cellular RNAs, after
these have been earmarked for decay by the addition of
a poly(A) tail by poly(A) polymerase (82). Some data suggest
that protection from the degradosome is the reason why
the 3' end of the (+) strand is taken up in (long-distance)
base-pairing (ld IX, figure 15-6). First, extensions of the 3'
end with a few C residues, though relieving HF dependence
and stimulating in vitro replication, are quickly lost
in vivo. In addition, destabilizing ld IX, which also boosts
in vitro replication, strongly reduces the titer in a wild-type
E. coli, but much less so in a mutant devoid of poly(A)
polymerase and polynucleotide phosphorylase (a 3' exonuclease)
(87). However, the flipside of RNase protection conferred
by ld IX is that the terminal Cs are inaccessible and
apparently require the help of extra proteins, S1 and HF, to be
transcribed.

The following model for (+) strand transcription is
proposed. As a first step polymerase holoenzyme binds the
template at the S- and M-sites via subunit S1. One argument
for S1 involvement is that core enzyme does not bind Qβ
RNA. Furthermore, electron micrographs show
that complexes of Qβ RNA with either holoenzyme or with
S1 alone produce the same looped structure (52). The loop
was tentatively shown to represent the RNA between S- and
M-sites. These two sites had previously been identified
by sequencing the RNA fragments that remained bound to
polymerase after digesting away unbound material by
RNase. The S-site spans the region from nucleotide 1248
to 1347 and overlaps the coat gene start (figure 15-5). It is
poorly structured and contains four stretches of
single-stranded RNA (51).

Unlike the S-site, the M-site is highly structured
(figure 15-6) and covers a large region (nucleotides
2500-3050). It is by and large the same area we now call replicase
domain 1 (RD1) (6, 41). RNase protection and UV
crosslinking experiments showed that contacts in the M-site are
spread out in space and mainly involve the loops of hairpins
and other single-stranded nucleotides (figure 15-6). The
center of the interaction is considered to be at the so-called
M2b-site (nucleotides 2663-2788). This region forms a
branched stem-loop which is conserved in all
single-stranded RNA coliphages (6). Changing base pairs in any of
the stems has no effect on in vitro replication (72), consistent
with the presence of many base-pair covariations between
the various phage RNAs (6). Deleting R24, R25 or R26 from
Qβ RNA results in a 4-fold decrease in replication. Deleting
the complete M2b-site lowers replication to about 10% of
the wild-type value. It is also interesting that in a binary
complex between phage RNA and 30S ribosomes one of the
UV crosslinks was between protein S1 and the loop of
R26 (17). The others were with parts of the S-site (17a).
Clearly S1 in the ribosome binds the same sites as S1 in the
polymerase.

Although there is no direct evidence that interactions 
exist between Qβ RNA and polymerase subunits other 
than S1, it would be surprising if the RNA binding activity 
of EF-Tu were not exploited in (+) strand transcription. One 
argument for other contacts is that in the absence of S1 there 
is still a 10-20% transcription activity of the core enzyme 
(72). S1 thus greatly stimulates a reaction that already 
proceeds in its absence. We propose that S1 anchors the 
template on the polymerase in a standby complex. In this 
complex EF-Tu and probably the β-subunit are positioned 
at the 3' end of the template, where they must await the 
occasional thermal breathing of Id IX to access the C-residues. 
The role of S1 in translation is similar except that there 
the protein serves the ribosome in awaiting the thermal 
unfolding of the initiator hairpin. This scheme may also 
resolve the old paradox that a Qβ-bound polymerase inhibits 
ribosome binding to the coat start but a 70S initiation 
complex does not inhibit polymerase (45). We suppose that 
the S1-mediated competition is between the two long-lived 
standby complexes. For all we know, the 70S initiation 
complex, in contrast to the 30S·phage RNA standby complex 
(51), has lost its contacts with the M-site and can thus 
no longer compete with the polymerase. Indeed, it would 
make sense to arrange competition between long-lived 
precursors rather than between short-lived intermediates 
present at the end of the initiation pathway. 

Role of Host Factor 

HF is a small heat-stable protein which forms hexamers. 
It does not associate with Qβ polymerase but acts directly 
on and in stoichiometry with the RNA (15, 45). Its 
stimulation of replication in vitro is about 5- to 10-fold, and cells 
lacking HF can still be infected, but titers are reduced. In the 
uninfected cell HF is held accountable for many effects. The 
most basic seems to be that the protein promotes translation 
of the σS messenger by relieving inhibitory RNA structure 
present at the ribosomal binding site, presumably with the 
help of a small cellular RNA (dsrA RNA) (80, 50, 21). Like S1, 
HF has affinity for poorly structured RNA regions, HF with a 
preference for purines and S1 for pyrimidines. 

Weber looked under the electron microscope at 
complexes formed between Qβ RNA and holoenzyme in the 
presence of HF and saw a two-loop structure. One loop was 
similar or identical to the one described above in the absence 
of HF; the second loop had formed between the 3' end and 
the M-site (52). Formation of the second loop may reflect a 
previously established HF binding site in hairpin V2 (74). 

Surprisingly, however, a complex between Qβ RNA and 
HF alone gave the same picture. This would mean that S1 
and HF have the same or neighboring binding positions at 
the S- and M-sites. For the M-site this was confirmed; some 
nucleotides are protected by both polymerase and HF, while 
others are exclusively protected by HF or polymerase 
(figure 15-6). Another consequence of these multiple 
binding sites is that HF folds back the 3' end to the S- and 
M-sites where the polymerase is bound. The 3' end and the 
M-site are already close neighbors via pseudoknot Id X and 
we suppose that HF stabilizes this tertiary structure. Both 
HF and polymerase bind a stretch of six single-stranded 
nucleotides 5'-adjacent to Id X (nucleotides 2966-2971) 
(figure 15-6). 

The best clue, however, about HF function stems from 
the pseudorevertant obtained by passaging Qβ through an 
HF-deficient host (71, 73). Five suppressor mutations were 
found that together raised transcription in the absence 
of HF from 10% to 60%. One of these (U2972C) changes a 
G-U to a G-C at the junction of the two stems of the 
pseudoknot (figure 15-6). It is possible that this mutation stabilizes 
the pseudoknot and thus the three-dimensional structure 
of Qβ. Note that disruptions of Id X fully abolish replication 
(14). The remaining four mutations are shown in figure 15-7, 
type B (right corner). Two of these (C4146U and A4148U) 
are difficult to interpret. The other two, however, replace 
a G-C by an A-U pair in Id IX. This change clearly 
destabilizes this stem, suggesting that HF somehow induces or 
protracts the presence of unpaired terminal nucleotides. 
If so, this is unlikely to be accomplished by inducing an 
(Schuppli et al., 71) 
Figure 15-7 Secondary structure of pseudorevertants obtained after passaging Qβ mutated in its 3' TD through wild-type 
(wt) or host-factor-deficient host. The center shows the wild-type Qβ structure together with transcription efficiency in the 
presence (+) and absence (-) of host factor. The remarkable feature is that almost all suppressor mutations occur near the 
center of the five-way junction and destabilize Id IX. Furthermore, whether passaged in wild-type or host-factor-deficient 
hosts, all pseudorevertants show strongly decreased dependence on host factor. G4129C is a mutant, not a 
revertant. E1.7, E2.9, and A2.2 were sequenced only from nucleotide 2349 to the 3' end. The transcription values refer to 
templates that were wild-type from nucleotide 1 to 2349 and revertant from 2349 to the 3' end. E1.7 and E2.9 also contain 
the suppressor mutation C2405U. This change is predicted to increase translation of the replicase gene as it turns an 
inhibitory G-C pair into a G-U pair. 


alternative folding because A-U → U-A base pair reversals at 
positions 4128 and 4185 designed to block formation of 
several base-pairing alternatives had little or no effect 
either on HF dependence of transcription or on titer (87). 
Therefore, it seems that HF aids in opening Id IX. We do not 
know how this is achieved but one of many possibilities is by 
binding to the UGGGAG moiety upon spontaneous breathing 
of Id IX, thus slowing down the backward reaction. However, 
the action of HF is not simply to "melt" the RNA because the 
dependence on the protein increases with temperature. 

Wild-type Qβ produces only a single HF-independent 
revertant. To obtain more revertants we evolved several 
phages with mutations in Id IX and stem-loop V1 in an 
HF-deficient host. Six different starting mutants produced 
only two additional revertants, whose probable structure 
is shown in figure 15-7 (type A and type C). Again Id IX has 
been destabilized by replacement of G-C by A-U or G-U, 
though at a different position than in the Weber revertant 
(type B). In addition, G4130 is deleted and there are changes 
in the 4146-4148 region as observed in the original Weber 
revertant (73). When those six starting mutants are evolved 
in a wild-type host we obtain two revertants. Type I has two 
base pair changes in Id IX, the other (type II) is identical to 
type A except that it does not have the A4148U change 
(figure 15-7). An interesting observation is that type I and 
type II, even though obtained by passage through wild-type 
E. coli, show strongly decreased dependence on HF. At the 
same time transcription of revertant RNA is usually higher 
than that of wild-type. For type II, for instance, transcription 
in the presence and absence of HF is 330% and 200%, 
respectively, of the value obtained for wild-type. For the 
type I revertant these values are 145% and 50%. It is clear 
that RNA replication in wild-type is negatively controlled 
and that reliance on HF is not an inevitable consequence 
of the base-paired terminal nucleotides per se. These can 
efficiently be accessed in the absence of HF, as the 
revertants show. Somehow, the dependence on HF that is 
incorporated in the wild-type structure makes the phage work more 
efficiently. 

Weber and coworkers cut off one arm at a time from the 
five-way junction shown in figures 15-6 and 15-7. Removal 
of U1, U2 or V1 decreased transcription to about 4%. 
Deleting V2 still left 25%. Structures of revertants obtained 
from ΔV1 and ΔV2 are shown as E1.7, E2.9, and A2.2 
(figure 15-7). (Revertant substitutions or insertions are 
boxed.) Again, even though passaged in a wild-type host, 
transcription of the revertants (E1.7 and E2.9) is almost 
independent of HF. For instance, for E2.9 transcription without 
HF is 33%, whereas with HF it is only 20%. In wild-type 
RNA these values are approximately 10% and 100%, 
respectively (D. Schuppli, J. Georgijevic and H. Weber, unpublished 
results). As above, the suppressor mutations destabilize Id IX 
at the side of the five-way junction. G-C becomes either A-C 
or G-U and there is again the U4147C change we have seen 
before. It is striking that all or almost all of the suppressor 
mutations localize at the five-way junction. The importance 
of this structure for HF dependence is also illustrated by 
mutant G4129C, which shows 150% (+HF) and 60% (-HF) 
transcription of the corresponding wild-type value. 
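
The transcription values quoted above can be put side by side as a fold-stimulation by HF (value with HF divided by value without HF). The following is only a small illustrative sketch, not part of the original analysis. It assumes that all percentages are expressed on one common scale (wild-type with HF = 100%); mutant G4129C is left out because its values are given relative to the "corresponding" wild-type value, which may be a different normalization. The function and variable names are made up for this illustration.

```python
# Transcription efficiencies copied from the text as (with HF, without HF).
# Assumption: all values share one scale, with wild-type (+HF) set to 100%.
TRANSCRIPTION = {
    "wild-type": (100.0, 10.0),
    "type I": (145.0, 50.0),
    "type II": (330.0, 200.0),
    "E2.9": (20.0, 33.0),
}


def hf_stimulation(with_hf, without_hf):
    """Fold-stimulation of transcription by host factor (HF)."""
    return with_hf / without_hf


def summarize(table):
    """Return the HF fold-stimulation per template, rounded to 2 decimals."""
    return {
        name: round(hf_stimulation(plus, minus), 2)
        for name, (plus, minus) in table.items()
    }


if __name__ == "__main__":
    for name, fold in summarize(TRANSCRIPTION).items():
        print(f"{name:10s} HF stimulation: {fold:.2f}-fold")
```

On these numbers wild-type RNA is stimulated about 10-fold by HF, whereas the revertants are stimulated less than 3-fold, and E2.9 is even slightly better transcribed without HF.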

Template Specificity 

Polymerases of all four species have been isolated, but that 
of species I turned out to be unstable. All contain S1, EF-Tu, 
EF-Ts, and the phage-coded subunit. Species III and IV share 
the same host factor (HF) but MS2 and GA (32) use a 
different protein. These phages grow equally well in wild-type 
and in HF-deficient E. coli. The host factor for GA (GA-HF) 
was isolated (104). There is not a great deal of data about 
template specificity and the contribution of HF to specificity 
has not been examined. Nevertheless, some results are clear. 
Qβ polymerase (+HF) will not copy MS2 or GA but it will 
to some extent accept species IV RNA. In contrast, GA 
replicase (tested without its HF) will not copy species III RNA but 
it will replicate species I and, surprisingly, was reported to 
copy SP RNA (104). Broadly speaking, replicases recognize 
RNA from within their genus. No experiments have been 
reported on whether RQ RNA can be replicated by SP or GA 
replicase. In view of the simplicity of these RNAs one would predict 
that they can. Nor do we know much about the template 
specificity of viral (-) strands. 

The specificity of the polymerases for their own (+) 
strands is remarkable considering that the major contacts 
come about through the same set of sequence-nonspecific 
host proteins: S1, HF, and EF-Tu. It has been clear for a long 
time that specificity is not the result of nucleotide sequence. 
Rather, specificity seems to be achieved by the shape of 
the RNA. Only for the cognate RNA will the 3' end be poised 
at the active site of the polymerase. Non-cognate RNA is 
predicted to be bound by the polymerase, but the 3' end 
will be at an unproductive position with respect to the β 
subunit. The best evidence for the decisive role of RNA 
structure is that disruption of Id VIII or Id X in Qβ RNA 
leads to complete template inactivity. Pseudorevertants of 
disrupted Id VIII have repaired the interaction with other 
base pairs (41, 42). 

Evolution of Man-Made Phage Mutants: 
Flexibility and Adaptation of the 
Genome 

In 1978 the first infectious cDNA clone ever, that of Qβ, was 
constructed and shown to produce wild-type phages (83). Its 
potential was apparently not realized until 10 years later, 
when infectious clones of Qβ and MS2 were prepared by 
Shaklee and ourselves and used for reverse genetics and 
other purposes. It was shown, for instance, that homologous 
RNA recombination does exist in RNA phages (frequency 
10^-8) and that Qβ polymerase seems able to copy the MS2 
(-) strand. That MS2 polymerase would transcribe the Qβ (-) 
strand could not be shown (75, 76, 62). 

Today infectious clones are exploited to answer a variety 
of questions that are difficult to solve otherwise, such as 
the importance of RNA structure elements for fitness. 
As an example we studied the role of the small hairpin 
containing the translational start signals for the coat gene. 
Accordingly, we constructed mutant 45.0 by substitutions 
at codon wobble positions and at upstream noncoding 
nucleotides (figure 15-8, center). These changes lowered the 
titer of the infectious clone by 4 orders of magnitude. Upon 
evolution this mutant produced multiple pseudorevertants 
that had accumulated suppressor mutations that restored 
the thermodynamic stability of the hairpin. As a result, the 
phage titer also returned to the wild-type value. This 
evolutionary process usually involved several successive base 
changes that stepwise improved the phenotype (58). 

The conclusion from such an experiment is that the 
precise sequence of the coat initiator hairpin is not so 
important but that its stability is crucial. Separate expression 
experiments, already mentioned above, have shown that 
a hairpin more stable than wild-type dramatically decreases 
coat gene translation (24), and this explains the return to the 
wild-type stability of mutant 45.0. Surprisingly, destabilized 
hairpins, though producing wild-type levels of coat protein, 
also revert to wild-type stability. Possibly, the weaker 
initiator hairpin results in an undesired advantage for the 
ribosome in its competition with the polymerase. 

A second example concerns the question whether or 
not the coupling between coat and lysis gene translation 
in MS2 is a coincidence resulting from the coding overlap 
Figure 15-8 Evolution of the stabilized coat-protein initiator hairpin shown as mutant 45.0 in the center. Black boxes in 45.0 
are man-made mutations. Black boxes in the pseudorevertants 45.1-45.8 are suppressor mutations selected by nature upon 
passaging the mutant phage. Numbers at arrows (e.g. 4x) show the number of plaques containing that specific suppressor 
mutation. All pseudorevertants have evolved an initiator hairpin that has a thermodynamic stability close to wild-type 
(boxed at top). Taken from (58). 


and the prevailing RNA structure or a biological necessity. 
Accordingly, the lysis hairpin (figure 15-3) which is directly 
responsible for this coupling was mutagenized 5' to the lysis 
start at neutral coat-coding positions. The mutations did 
not change the amount of lysis protein made, but the 
production was no longer dependent on coat translation. 
At the same time the titer of the mutant infectious clone 
dropped 4 logs compared with wild-type. The various 
pseudorevertants that were analyzed showed several 
second-site suppressor mutations. Their titer had again reached 
wild-type level and, most importantly, production of lysis 
protein was again under control of the coat gene. Therefore, the 
coupling is a biological necessity that adds to the fitness of 
the phage. Further analysis showed that control had not 
been regained by building a variant of the original hairpin 
(as in the first example). Instead, the initial man-made 
mutations had fully destroyed the lysis hairpin but favored 
several alternative foldings. These alternatives were 
stabilized in the pseudorevertants to the extent that independent 


access of ribosomes to the L start was excluded. At the same 
time termination-dependent reinitiation could still take 
place (40). 

Besides showing the basics of Darwinian evolution these 
experiments demonstrate the importance of viral RNA 
structure. Without changing the coding content of the 
genome, the mutations in these two examples decreased 
the titer by 4 orders of magnitude. 

More severe insults to viability are deletions, because 
their repair generally requires a duplication of a nearby 
sequence together with its adjustment to the specific needs 
at that position. The probability of finding such revertants 
in the quasi-species pool is very low and deletions are 
therefore often lethal. Sometimes, however, nearly incredible 
solutions are found to neutralize deletions. A 19-nucleotide 
deletion introduced in MS2 cDNA in the intercistronic 
region between maturation and coat genes completely 
removed the SD sequence of the coat gene as well as the 
initiator hairpin. Production of coat protein was undetectable. 
The deletion caused a 10^10-fold decrease in titer, leaving 
only several plaques instead of the usual 10^11 for wild-type 
MS2 cDNA. These plaques showed two solutions. In one, a 
specific 14-nucleotide duplication near the site of the 
deletion restored both the SD sequence and the hairpin (figure 15-9, 
Rev2.1). In the other the same two features were 
reinstalled by a precise deletion of 6 nucleotides (figure 15-9, 
Rev1.1) that separated the two halves of a potential SD 
sequence (GGNNNNNNAG). Thereafter both 
pseudorevertants further improved themselves by base substitutions 
and sometimes an additional small insertion. Most but not 
all of these further changes could be rationalized (legend to 
figure 15-9) (61). 
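
The second solution, in which a 6-nucleotide deletion joins the two halves of a split SD sequence, can be mimicked on paper. The sketch below is only an illustration of the principle: it looks for the pattern GGNNNNNNAG and removes the six spacer nucleotides so that GGAG results. The example sequence is invented for this purpose and is not the MS2 sequence; the function name is likewise hypothetical.

```python
import re

# A potential "split" Shine-Dalgarno sequence: GG, six arbitrary
# nucleotides, then AG (written GGNNNNNNAG in the text).
SPLIT_SD = re.compile(r"GG([ACGU]{6})AG")


def join_split_sd(rna):
    """Delete the 6-nt spacer of the first GGNNNNNNAG match.

    Returns the shortened sequence, in which GG and AG are now
    adjacent (GGAG), or None if no split SD pattern is present.
    """
    match = SPLIT_SD.search(rna)
    if match is None:
        return None
    start, end = match.span(1)  # coordinates of the six spacer nucleotides
    return rna[:start] + rna[end:]


if __name__ == "__main__":
    # Hypothetical sequence, NOT taken from MS2.
    example = "CAUUGGUCAUCAAGUUAUG"
    print(example)
    print(join_split_sd(example))
```

In the phage, of course, the deletion was found by selection rather than by design; the sketch only shows why a precise 6-nucleotide deletion suffices to create a contiguous GGAG.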

In another example the replicase-operator hairpin of 
MS2 was randomized. In one of the obtained mutants, 
AL20, this resulted in two consecutive stop codons in the 
lysis reading frame (marked by a line in figure 15-10). The 
titer dropped 7 logs and survivors (AL20.1) had deleted 
the complete operator sequence (15 nucleotides) to get rid of 
the stop codons. In a next step six additional nucleotides 
were dismissed to produce the STRANGE pseudorevertant 
(46). The selection pressure for this second step is not 
known. Above we have seen that the operator hairpin plays 
important roles in shutting off replicase synthesis and in 
encapsidation. Its deletion shows that the shut-off of 
replicase is not essential but improves fitness of the phage. 
Furthermore, the observation that virions can still be 
formed shows that there must be secondary coat protein 
binding sites in the RNA, as suggested by Peabody on the 
basis of similar evolutionary experiments (66). 

These evolutionary games show that there are several 
layers of sophistication in phages. Many interesting features 
such as regulation of translation, and efficient and 
coordinated encapsidation, are not essential for existence as a 
phage, but they add greatly to fitness. The generation of 
such elaborate mechanisms is unavoidable and is a direct 
consequence of the everlasting pressure to outcompete 
fellow sequences. In this respect it is relevant that none of 
the pseudorevertants ever isolated by us could successfully 
compete with wild-type. It is clear that in wild-type every 
nucleotide is carefully selected for optimal performance. 
This explains also why the sequence of an RNA virus is 
so stable in spite of the inaccuracy of the polymerase (~1 
error per 10^3-10^4 nucleotides). 
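
To get a feeling for what such an error rate means per genome, the following sketch computes the expected number of errors per copy and the fraction of copies made without any error. It is a back-of-the-envelope illustration only. It assumes that errors occur independently at every position with the same probability, and the genome lengths used (about 3,570 nucleotides for MS2 and about 4,220 for Qβ) are approximate values supplied here for illustration, not taken from this section.

```python
# Approximate genome lengths in nucleotides (assumed for illustration).
GENOME_LENGTHS = {"MS2": 3570, "Qbeta": 4220}


def expected_errors(genome_length, error_rate):
    """Mean number of misincorporations per genome copy."""
    return genome_length * error_rate


def error_free_fraction(genome_length, error_rate):
    """Fraction of copies without any error, assuming independent,
    equally probable errors at every position."""
    return (1.0 - error_rate) ** genome_length


if __name__ == "__main__":
    for phage, length in GENOME_LENGTHS.items():
        for rate in (1e-3, 1e-4):
            print(
                f"{phage:6s} rate {rate:.0e}: "
                f"{expected_errors(length, rate):.2f} errors per copy, "
                f"{error_free_fraction(length, rate):.1%} error-free copies"
            )
```

Under these assumptions a sizeable fraction of all copies carries at least one change, which underlines how strong the selection must be that keeps the wild-type sequence stable.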

Pressures on RNA Structure by 
Host RNases 

We have seen that RNA structure can make an active 
contribution to fitness, for instance by regulating translation 
and replication. However, one should realize that there 
are many other constraints that affect the final shape of the 
RNA. A well-known example is that large unstructured 
regions must be avoided as they may catalyze full duplex 
formation between (+) and (-) strand (5). In addition, such 
unstructured regions may be targets for cellular RNases. 

When a noncoding hairpin in MS2 RNA was extended 
either with a large open loop of 26 nucleotides (figure 15-11, 
mutant A) or an almost fully base-paired stem (mutant C), 
the stem turned out to be genetically stable but the large 
open loop suffered deletions until an acceptable size had 
been reached (5-10 nucleotides; top left) (60). A similar 
observation was made when Qβ RNA from which a part of 
the read-through protein gene had been deleted was evolved 
in a host supplying this protein in trans. Plaques of these 
phages were initially small and titers low, but fitness 
improved upon passaging. Sequence analysis showed that 
unpaired RNA regions that resulted from the initial deletion 
had been removed, leading to a further abutted RNA devoid 
of extensive unpaired segments (3, 35). 

A direct role for RNases in influencing the structure of 
phage RNA is not easy to prove since mutants devoid of all 
exonucleases or lacking endonuclease RNase E are not 
viable. There is, however, a null mutant for RNase III, an 
endonuclease active in maturation and breakdown. Its 
target is an uninterrupted double-stranded RNA stem of 
sufficient length (~17 bp) (23). When such a stem was 
incorporated in MS2 (figure 15-11, mutant R), it was genetically 
stable in an RNase III mutant but not in wild-type. Here 
the stem evolved in many different directions, all leading 
to RNase III resistance (top right). Some pseudorevertants 
had reduced the length of the stem or had created one or 
more mismatches. Others showed a deletion on either side 
leading to big bulges. A last but most interesting category 
had bulges caused by the insertion of untemplated A or U 
residues (39). These residues turned out to be added by the 
host enzyme poly(A) polymerase as witnessed by the fact 
that such As or Us were lacking in a host devoid of this 
enzyme (94). In trying to reconstruct the events it was 
inferred that RNase III cleaves the substrate stem (either 
(+) or (-) strand) on one of its sides; then poly(A) 
polymerase adds As to the 3' OH created by the cleavage, thus 
earmarking the molecule for destruction by 3' exonucleases. 
Meanwhile MS2 polymerase can begin to copy such a 
polyadenylated or partially degraded RNA. Arriving at the site of 
cleavage the polymerase with the nascent chain apparently 
detaches and reinitiates somewhere on the other side of 
the cut. This may be on the added poly(A) tail or further 
upstream. The release and relanding of the polymerase at 
the physical end of the template has been observed before 
and has been called “run off recombination” (100). 

Assembly and Release of Virions 

The assembly of virions has not been studied in great detail. 
For MS2 it has been deduced that the first step in this 
process must be the binding of the maturation protein to 
the RNA (44). Two RNA regions are involved. One lies in the 
Figure 15-9 Evolutionary reconstruction of the deleted coat-protein initiator hairpin in MS2 RNA. A: At the sequence level 
and B: at the level of secondary structure. In the starting deletion mutant (bottom left) 19 nucleotides have been deleted and 
coat protein synthesis could no longer be detected. In Rev2.1 a specific 14-nucleotide duplication (bold italics) has occurred 
that restores both the initiator hairpin and the A-protein terminator hairpin as well as the Shine-Dalgarno (SD) sequence. 
Panel B shows that in Rev3.1 and Rev2.2 the terminator is further polished by replacing the C-G mismatch by G-U and G-C, 

Figure 15-10 Evolution of MS2 mutant AL20. This mutant 
contains multiple point mutations in the replicase operator 
hairpin, some of which have created two consecutive 
stop codons in the reading frame of the lysis gene. Evolution 
deletes the complete operator including the stop codons and 
restores the lysis function at the expense of the operator. 


A-protein gene around position 400. The other involves 
hairpin V2 in the 3' UTR (figure 15-12) (78). It is conceivable 
that maturation protein binding to this site, which in Qβ 
is an HF binding site, inhibits replication and could thus 
serve as a regulatory signal. The MS2 RNA-maturation 
protein complex can be made in vitro and it is infectious 
for F+ cells (78). 

The second step in virion construction is the assembly 
of the coat protein around the RNA to form an icosahedral 
shell. This reaction does not depend on the A-protein. The 
nucleation point for this “coating” reaction is the replicase 
operator hairpin (figure 15-3). Coat protein will form 
capsids even in the absence of phage RNA. However, capsid 
formation occurs at a much lower concentration of coat 
protein if phage RNA is present, assuring that empty 
capsids will not form and all phage RNA becomes packaged 
(102, 103). 

The infection cycle ends when the accumulation of 
the lysis protein in the cytoplasmic membrane has, in an 
Figure 15-11 Acceptance or rejection of various RNA 
structure elements introduced in a noncoding region of 
MS2. At the left we show the wild-type structure, containing 
the stop codon (UAG, italics) of the maturation gene. Mutant 
A has been given a big unstructured loop that is almost fully 
deleted during passaging (double arrows). Mutant C, 
containing a nearly perfect hairpin extension, is genetically 
stable for many generations. Mutant R with the perfect 
hairpin extension is an RNase III substrate. In RNase 
III-deficient strains the stem is genetically stable but in 
wild-type E. coli suppressor mutations occur that make the stem 
RNase III resistant (bold italics, top right). We show only 
some representative examples of the pseudorevertants 
obtained. 


indirect manner, caused the collapse of the cell wall. This 
75-amino-acid-long hydrophobic protein appears to 
short-circuit the cytoplasmic membrane by forming pores (27). 
Loss of membrane potential then leads in an unknown way 
to degradation of peptidoglycan. In electron micrographs 
one usually sees that only a very small section of the 
sacculus (the cell wall network) has dissolved, often at the 
equatorial growth zone. Bacterial growth is required for cell lysis 
and in this respect there is a parallel with the action of 
penicillin (97). 

There are no distinct motifs in the lysis proteins of the 
group A phages except for a strong clustering of 
hydrophobic amino acids near the C-terminus. The activity of 
the MS2 lysis protein resides in the C-terminal 35 or so 
amino acids. The first 40 amino acids are dispensable (8a). 
Figure 15-12 Comparison of RNA secondary structures in the 3' UTR in four representative RNA phages. SP is a species IV 
allolevivirus. PP7 and AP205 resemble SP in their simple 3' UTR structure. MS2 has a more elaborate folding with many 
extra elements (shaded). The stop codon of replicase is boxed (R stop). Also boxed is a conserved sequence in the top of 
U1 (UGCUU). In Qβ and SP these nucleotides form a pseudoknot with their complement 1200 nucleotides upstream 
(figure 15-6). For MS2, PP7, and AP205 a similar interaction is supposed to exist. 


There may be a similarity between the L-protein of the 
leviviruses and the gene E product of single-stranded DNA 
phage φX174 in the way these proteins induce lysis. The 
E-protein seems to interfere with the activity of the 
E. coli translocase I (MraY). This integral membrane protein 
catalyzes the formation of a murein precursor (11, 105). 

Single-Stranded RNA Phages in Other 
Bacteria and Their Phylogenetic 
Relationships 

For historical reasons the RNA phages of E. coli have received 
most attention and almost all of what we know has been 
derived from either Qβ or MS2 (R17, f2). Nevertheless, we 
may assume that very many, if not all, Gram-negative 
bacteria have their own RNA phages. Studying these will 
help us to determine which phage properties are general 
and which are specific for coliphages. In addition, such 
a comparative study may help to construct a plausible 
phylogenetic tree. 

In the past single-stranded RNA phages have been 
isolated from Caulobacter and Pseudomonas (77). These infected 
their host via polar pili, making it clear that the F-pili 
pathway found for the coliphages is not a general feature. Lately, 
an RNA phage, AP205, growing in Acinetobacter, was 
discovered. The pilus used for infection has not been identified. 

Only two non-coliphages have been sequenced: AP205 
(43), and PP7 (59) which infects Pseudomonas. Nucleotide 
sequence identity among phages growing in different 
bacteria is close to random except for regions where 
conserved protein sequences are encoded, notably in parts of 
the replicase. 

PP7 has the length and genetic map of MS2 or GA and 
was thus classified as a Levivirus (figure 15-1). There is no 
criterion to further assign PP7 to species I or II. AP205 on 
Figure 15-13 Proposed phylogenetic tree for the single-stranded RNA phages. The prototype combines infection and cell 
lysis in a single protein, A(L), and has a simple structure at the 3' end, as in SP and PP7. The branching off of the alloleviviruses 
partly relieves protein A(L) from its dual role by creating the read-through protein which helps in infection. In the leviviruses 
the A(L) protein is relieved from its lysis function by evolving the dedicated lysis protein. In coliphages GA and MS2 we 
observe the development of increasingly elaborate 3'-end structures. A similar tendency can be seen between SP or NL95 
(species IV) and Qβ, MX1, and M11 (species III). 


the other hand is about as long as Qβ or SP, but 
considering that AP205 does not contain a read-through protein 
the phage was classified as a Levivirus. It is also unusual 
in the sense that its putative lysis gene occurs not at 
the classical position but at the 5' terminus (ORF1 
in figure 15-1). 

Despite large differences in hosts and RNA sequences, 
common structural features can be recognized in all 
single-stranded RNA phages: (i) the extremely strong 5' hairpin, 
(ii) relatively high single strandedness at the RBS of the 
coat gene, (iii) replicase operator (43), (iv) long-distance 
pseudoknot, (v) the folding of the 3' UTR (see below) and 
(vi) a conserved pentanucleotide sequence in the loop of 
the 3' terminal hairpin (figure 15-12) (43). There might 
even be a more extensive common pattern in the overall 
RNA folding, but at present we have not enough certainty 
about the RNA secondary structure of PP7 and AP205. 

A most difficult question is the phylogenetic relationship 
between the known phages, especially between coliphages 
and non-coliphages and between leviviruses and 
alloleviviruses. Usually relationships are based on sequence identity 
between corresponding genes or proteins and this 
procedure is reliable when closely related phages need to 
be ordered or classified, for example to show that JP34 is 
species II and not I or that MX1 is species III rather than IV. 
However, when we must order MS2, Qβ, PP7, and AP205 
the nucleotide or amino acid sequences provide little 
direction. The mutation rate of RNA phages is extremely 
high and as we have seen there can be many nucleotide 
changes within a few hours. For such and other reasons 
the RNA sequence may not reliably reflect the evolutionary 
history but rather show the “needs of the day,” which may 
be quite different from those of yesterday or tomorrow. 
Therefore, to compare distantly related phages it may 
be better to look for features that are more stably inherited 
than nucleotide or amino acid sequence. One such stable 
feature is the genetic map. Another, as we have seen 
above, is the RNA folding. A conspicuous landmark in this 
respect is the structure of the 3' UTR which, in the 
coliphages, is a hallmark of its genus and even species 
(2). Qβ and SP have a short simple version forming a 5- and 
4-way junction, respectively. MS2 on the other hand has a 
much longer and more complex 3' UTR (figure 15-12). 
Surprisingly, PP7 and AP205, though showing a Levivirus map, 
have the simple Allolevivirus-type 3' end. Apparently, 
AP205 and PP7 take an intermediate position between 
MS2 and Qβ type phages. 

Respecting genetic map and structure elements as deeply
inherited traits and also assuming that evolution proceeds
from simple to more complex, we arrive at the tree shown
in figure 15-13. The proposed common ancestor has features
that are common to all RNA phages known today. This
prototype contains only the three major genes. There is
neither read-through nor a separate lysis gene, and the
3' UTR has the short simple version. In this prototype the
lysis function can be carried out by the maturation protein,
as it still is today in Qβ. To arrive at alloleviviruses we need
only a duplication or insertion (and its subsequent
adaptation) to produce the extra read-through (RT) protein.
On the other hand, to arrive at the present-day leviviruses
it suffices to exploit a basic property of the bacterial
ribosome, namely its restart capacity. By adjusting wobble
positions in the early replicase gene, such restarts at the
end of the coat gene can produce hydrophobic peptides, an
important condition for lytic activity (MS2, GA, PP7). In
AP205, lysis has evolved in a vacant part of the genome. As
suggested (16), the pressure for these events may have come
from the possibly inefficient double function of the
maturation protein in the ancestor. The evolution of lysis
and read-through protein can thus be considered as two
different solutions for the same problem. In leviviruses the
inefficient lysis property of the maturation protein is solved
by developing a dedicated lysis gene, whereas in
alloleviviruses the inefficient infection is improved by
developing the read-through protein, allowing the maturation
protein to become a better lysis protein. The simplicity of
this scheme is attractive, even if it places the two
coliphages MS2 and Qβ at the greatest possible evolutionary
distance. The evolution of the elaborate 3' UTR in MS2 and
GA remains unexplained.

Phages with Segmented Double-Stranded
RNA Genomes

LEONARD MINDICH 


The first member of this family of bacteriophages, the
Cystoviridae, was isolated in 1973. This phage, named
φ6, was found to have a genome of three double-stranded
RNA molecules packaged within a polyhedral nucleocapsid
surrounded by a lipid-containing membrane (47, 52). The
basic outlines of the life cycle of φ6 were described in
several reviews (25, 29). Since that time, a number of
important advances have been made. The structures of the
virus and subviral particles have been defined through the
application of cryo-electron microscopy (1, 7). The structure
of the RNA-dependent polymerase has been elucidated through
X-ray diffraction studies on protein crystals (2, 3). The
reassembly of inner core particles of both φ6 and φ8 has been
accomplished with pure individual proteins (21, 40). The
packaging of the genomic segments has been elucidated
and several reverse genetic systems have been developed
(27, 28, 48). Recombination between the genomic segments
has been described (26). In addition, a number of relatives
of φ6 have been isolated from nature, enlarging the family
of Cystoviridae (31).

The Family of Cystoviridae 

Bacteriophage φ6 and other members of the Cystoviridae
have genomes of three double-stranded RNA segments.
They are designated L, M, and S and their respective sizes
are about 6.5 kbp, 4 kbp, and 3 kbp (figure 16-1). The L
segment contains the genes for the proteins of the inner core
structure, which is also called the polymerase complex or
the procapsid. Segment M contains genes for host
attachment and segment S contains genes for membrane
structure and assembly as well as lytic proteins and protein
P8, which in some cases forms a shell around the inner core.
The inner core is a dodecahedron when empty and is composed
of four proteins: P1, P2, P4, and P7 (figure 16-2). P1 is the
major structural protein of the inner core and 120 molecules
of P1 are arrayed in a pattern that is common to most dsRNA
viruses that infect eukaryotic cells. P2 is the RNA-dependent
RNA polymerase.

All the members of the Cystoviridae except φ8 have a shell
of P8 around the filled inner core. In the case of φ8, protein
P8 is a component of the membrane. All members of the
family have a membrane as the exterior structure of the
virion. φ6 and its closest relatives have a host attachment
structure composed of a hydrophobic protein, P6, that
anchors the attachment specificity protein P3 to the
membrane. These phages attach to a specific type IV pilus
on the host bacterium. The other members of the family
(φ8, φ12, and φ13) attach to rough lipopolysaccharide (LPS)
on the host cells (31). This is mediated by a structure
composed of a number of large proteins designated as P3a,
P3b, and/or P3c. There is no sequence similarity between
the two classes of proteins. Although the host specificity
of the phages is primarily to pseudomonads, particularly
Pseudomonas syringae, mutants of the phages can plaque on
other pseudomonads such as Pseudomonas pseudoalcaligenes.
The phages that attach to rough LPS can infect rough LPS
mutants of other Gram-negative bacteria, but do not form
plaques on them. However, φ8 can form plaques at high
efficiency on heptoseless mutants of Salmonella typhimurium
(17). The family Cystoviridae now consists of nine isolates.
It seems reasonable that there are many more members in
nature and that the host range will be extended to other
Gram-negative bacteria. Although these phages are somewhat
more fragile than the double-stranded DNA phages,
they appear to have found a niche that becomes available
after a population has responded to infection by phages
that attach to normal LPS. φ6 was isolated from bean
plants; a number of the newer phages were isolated from
snow pea leaves and others were isolated from the leaves of
carrot, radish, basil, and tomato plants (31).




[Figure 16-1 map. Segment L, 6374 bp: genes 14, 7, 2, 4, 1.
Segment M, 4061 bp: genes 10, 6, 3, 13. Segment S, 2948 bp:
genes 8, 12, 9, 5.]

Figure 16-1 The genomic segments of φ6. Proteins P7, P2, P4, and P1 comprise the inner core particle. Proteins P6 and P3 are
the host attachment apparatus while P10 and P5 are involved in host cell lysis. P5 is also involved in host penetration. P9 is
the major membrane protein, which is inserted into the membrane through the action of P12. P8 forms a shell around the inner
core particle. Each segment has a unique pac region near the 5' end of the (+) strand.

Figure 16-2 A simplified diagram of cystovirus structure.
Double-stranded RNA is contained inside a core of P1, P2, P4,
and P7. This structure is within a shell of P8 (51) in all
members except φ8, where P8 is a membrane component.
The lipid-containing membrane surrounds the P8 shell.
Attachment proteins P6 and P3 are in the membrane along
with P9 (the major membrane protein) and P10. See
thebacteriophages.org/frames_0160.htm for a color version
of this figure.

Structure of the Inner Core 

Four proteins coded by segment L assemble to form the
procapsid or inner core of these viruses. The proteins are
designated P1, P2, P4, and P7. P1 is the major structural
protein and is present in 120 copies (6, 7). P2 is the
RNA-dependent RNA polymerase, responsible for both (+) and
(-) strand synthesis (14, 53). There are probably 12 copies
per virion. P4 is an NTPase that is necessary for the
packaging of (+) strand RNA molecules into the procapsid (13).
It appears as a homohexamer at each of the 12 5-fold faces
of the φ6 procapsid (7), but in lower proportions in the φ8
procapsid (49). Protein P7 plays an auxiliary role in
assembly, packaging, and RNA synthesis (14, 19). The empty
procapsid has a polyhedral shape with 5-fold vertices
pulled toward the center of the structure (7). Filled particles
have a more spherical shape, with the P4 turrets extended
(figure 16-3) (1).

The procapsid structure is somewhat unstable, and
freeze-thawing in high salt causes it to dissociate into its
constituent proteins. It has been possible to isolate the
individual proteins and then to reassemble the particles at
high efficiency (21, 40). In this manner it has been possible
to study the kinetics of assembly of the core and to
reconstitute infectious virions from the component proteins
and RNA. The kinetics of assembly suggest that nucleation of
φ6 and φ8 involves tetramers of protein P1 interacting with
P4 hexamers and, in the case of φ8, molecules of P2.

Genomic Packaging 

Purified procapsids can be isolated from cultures of E. coli
carrying plasmids with cDNA copies of genomic segment
L. These structures are capable of recognizing cognate
RNA, packaging it, replicating it as (-) strand to form
double-stranded RNA, and finally synthesizing (+) strand
messenger RNA, which is extruded from the particle (27). All
these reactions can take place in vitro with no accessory
factors from the host cells. The particles can even carry out
recombination between the RNA molecules (26).

Figure 16-3 Phage φ6 inner core particles. A: Cryo-electron
microscopy reconstructions of empty procapsids (7).
B: Filled procapsids (1). The structures on the 5-fold faces
are the multimers of the NTPase, P4.

Using the in vitro system, it has been possible to work out
the rules governing genomic packaging. (+) strand RNA is
packaged in the order s:m:l with a requirement for hydrolysis
of NTP. The NTP efficacy in packaging reflects the NTPase
activity of the hexameric form of protein P4 in φ6
(13, 20). The 5' terminal sequences of the three segments are
identical for 18 bases in φ6 with the exception of the second
nucleotide, but vary in the other phages (figure 16-4). The
5' end of the (+) strands has a sequence that is very AU-rich.
This is probably to facilitate melting for transcription. Each
of the (+) strand copies has a pac sequence near, but not at,
the 5' end. These sequences in φ6 are approximately 200
nucleotides in length. They are completely different for each
segment with virtually no sequence identity between them
(12). These sequences promote specific secondary structures
with a number of stem-loops (27, 39). The (+) strands also
have stem-loop structures at their 3' ends. These structures
are primarily for nuclease resistance but the terminal
sequences do play a role in polymerase specificity (35). The
(+) strand molecules of S can be packaged alone, but M
requires the prior packaging of S, and packaging of L requires
the prior packaging of M (figure 16-5). (-) strand synthesis
begins when the full complement of RNA is packaged (10).
(+) strand synthesis begins when the entire genome is
converted to double-stranded RNA. The (+) strands are
packaged from the 5' ends. RNA with strong hairpins at the
3' end is packaged but the hairpin structures remain outside
the particle (44). Studies with isolated hexamers of P4 of
φ8 and φ12 show that RNA passes through the hexamer in a
5' to 3' direction (22).

Phage  Segment  5' sequence
φ6     L        GUAAAAAAACUUUAUAUA
φ6     M, S     GGAAAAAAACUUUAUAUA
φ13    L        GAUAAAAACUUUAUAUU
φ13    M        GGAAAAAACUUAUAUUU
φ13    S        GGAAAAAACUUUUAUAA
φ12    L        ACAAUAAAAUAAUACAA
φ12    M        GAAUUAAUUAAAUUUUA
φ12    S        GAAUAAACUAAAUAAAG
φ8     L        GAAAUUUAUUUAAAGCU
φ8     M        GAAAUUUUCAAAGUCUU
φ8     S        GAAAUUUUCAAAUCUUU

Figure 16-4 5' sequences of (+) strand transcripts of
genomic segments.

Figure 16-5 Serial dependence of genomic packaging.
Packaging of radioactive (+) strands of exact copies of
genomic segments S and M and of a truncated segment L
of φ6. Radioactive transcripts of plasmids were incubated
with procapsids, treated with RNase I, and applied to a 2%
agarose gel.
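
The statements about the 5' terminal sequences can be checked directly against the φ6 entries of figure 16-4. The short Python sketch below is only an illustration: the function names are hypothetical, and the two 18-base strings are copied from the figure.

```python
# 5' terminal sequences of the phi6 (+) strands, as listed in figure 16-4.
PHI6_5PRIME = {
    "L": "GUAAAAAAACUUUAUAUA",
    "M": "GGAAAAAAACUUUAUAUA",
    "S": "GGAAAAAAACUUUAUAUA",
}


def mismatch_positions(a, b):
    """Return the 1-based positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return [i + 1 for i, (x, y) in enumerate(zip(a, b)) if x != y]


def au_fraction(seq):
    """Return the fraction of bases in seq that are A or U."""
    return sum(base in "AU" for base in seq) / len(seq)


if __name__ == "__main__":
    # Segment L differs from M and S only at the second nucleotide.
    print(mismatch_positions(PHI6_5PRIME["L"], PHI6_5PRIME["M"]))
    for name, seq in PHI6_5PRIME.items():
        print(name, round(au_fraction(seq), 2))
```

With these inputs the only mismatch between L and M (or S) is at position 2, and 15 or 16 of the 18 bases are A or U, which is in line with the description of the 5' ends as very AU-rich.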

A model that rationalizes the peculiarities of packaging
has been proposed (figure 16-6) (27). It makes a number
of predictions that have been borne out experimentally.
The model proposes that the outside of the empty procapsid
has binding sites for the (+) strand of segment S. The
RNA binds and its 5' end is positioned in the pore of a P4
hexamer. The NTPase acts as a motor to pull the RNA into
the particle. As a result of this uptake, the particle expands
and the binding sites for S are lost and binding sites for M
appear. The process is repeated, resulting in the loss of the
binding sites for M and the appearance of binding sites
for L. Once segment L is packaged, another conformational
change results in the activation of the polymerase to begin
(-) strand synthesis. Upon completion of (-) strand
synthesis, the RNA content of the particle is doubled, which
causes a further conformational change in the particle and
the onset of (+) strand transcription.

[Figure 16-6 labels: Segment S (3 kb), Segment M (4.1 kb),
Segment L (6.4 kb).]

Figure 16-6 The packaging model. A: The empty procapsid shows only binding sites for S. After a full-size S is packaged, the
S sites disappear and M sites appear. After a full-size M is packaged, the M sites disappear and L sites appear. After a full-size
L is packaged, (-) strand synthesis commences. After (-) strand synthesis is completed, (+) strand synthesis commences.
B: If segment S is of the size equal to the sum of both S and M, the S sites will disappear and the L sites will appear and
segment L will be packaged without segment M. See thebacteriophages.org/frames_0160.htm for a color version of this
figure.


Packaging of RNA can be studied easily in the φ6 system
because RNA that is inside the core particle is resistant to
RNase I (9, 41). Packaging of a normal (+) strand molecule
can be inhibited by excess amounts of smaller homologous
RNA as long as the competing RNA has a proper pac
sequence (45). However, it is possible to construct competing
RNA molecules that cannot themselves be packaged. This is
done by reducing the distance between the 18 base consensus
sequence at the 5' end and the pac sequence. If the deletion
is too great, the RNA does not compete or package; but there
is a sequence reduction where the RNA is not packaged
but does compete. This supports the contention that the
specificity of packaging is determined on the outside of
the particle. Additional studies have shown that binding
of RNA to the surface of φ6 procapsids is specific in the
presence of ATP (42). The (+) strand of S binds well but
those of M and L do not bind under normal conditions.
Crosslinking of RNA to procapsids has shown an association
with the region between amino acids 98 and 155 in P1, the
major structural protein of the inner core (42). P1 of φ6
has 769 amino acids.


RNA molecules that are smaller than their normal size
are packaged as long as they contain a proper pac sequence;
however, the packaging system brings in an amount of RNA
that approximates the normal weight of that RNA segment.
If the molecule is half-size, then two are packaged; if it is
quarter-size then four molecules are packaged. This is
consistent with the idea that packaging specificity changes
as a result of the expansion of the procapsid surface due
to the amount of RNA packaged. It was predicted that
a molecule with the pac sequence of S with the size of S plus
M would promote the packaging of segment L without the
requirement for segment M. This was borne out. Similarly,
it was found that packaging a normal S segment along with
a segment with the M pac sequence but the size of M plus
L would turn on (-) strand synthesis without the
requirement of RNA with L sequence. The procapsids were also
capable of packaging an RNA molecule equal in size to the
entire genome of φ6, 14 kb, which then promoted (-) strand
synthesis followed by (+) strand synthesis.
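
The size-dependent switching of binding sites described above can be written as a small state machine. The Python sketch below is a toy illustration, not the published model: the function names are hypothetical, the segment sizes are taken from figure 16-1, and the `tolerance` parameter (how close to the cumulative normal size the packaged RNA must come before the sites switch) is an assumption introduced only to make the rule concrete.

```python
# (+) strand sizes in nucleotides, taken from figure 16-1.
SEGMENT_SIZE = {"S": 2948, "M": 4061, "L": 6374}
ORDER = ("S", "M", "L")


def active_site(packaged_nt, tolerance=0.9):
    """Return the pac class the procapsid currently binds, or None when full.

    Assumption: sites switch once the packaged RNA reaches `tolerance`
    times the cumulative normal size of the segments packaged so far.
    """
    cumulative = 0
    for segment in ORDER:
        cumulative += SEGMENT_SIZE[segment]
        if packaged_nt < tolerance * cumulative:
            return segment
    return None


def package(supply, tolerance=0.9):
    """Simulate packaging from transcripts available in excess.

    `supply` maps a pac class ("S", "M", "L") to the length of the
    transcript carrying that pac sequence. Returns the list of packaged
    pac classes and a flag that is True when the particle is full, that
    is, when (-) strand synthesis would be switched on.
    """
    if any(length <= 0 for length in supply.values()):
        raise ValueError("transcript lengths must be positive")
    packaged_nt = 0
    log = []
    while True:
        site = active_site(packaged_nt, tolerance)
        if site is None:
            return log, True      # full complement packaged
        if site not in supply:
            return log, False     # no RNA with the required pac sequence
        packaged_nt += supply[site]
        log.append(site)


if __name__ == "__main__":
    print(package({"S": 2948, "M": 4061, "L": 6374}))  # normal genome
    print(package({"S": 1474, "M": 4061, "L": 6374}))  # half-size S
    print(package({"S": 7009, "L": 6374}))             # S pac, size of S plus M
    print(package({"S": 2948, "L": 6374}))             # M missing
```

Under these assumptions a half-size S transcript is packaged twice before M sites appear, an S transcript with the size of S plus M allows L to be packaged without M, and a normal-size S without any M transcript stalls after S.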

These findings were corroborated with experiments in
which living virus was constructed with genomic segments
that were chimeras of segments S and M. In the case where
S was at the 5' end, the chimeric molecule replaced both the
S and M segments; however, when M was at the 5' end,
the virus required an independent segment S for viability
(figure 16-7). It was also possible to prepare virus with a
single genomic segment carrying all the genetic information
of the normal genome but with the pac sequence of S
(figure 16-8) (36).

Figure 16-7 Agarose gel electrophoresis of double-stranded
RNA isolated from virions. Lane v shows the distribution of
normal segments L, M, and S. Lane a shows double-stranded
RNA from bacteriophage φ2007, which has a deletion in
segment M. Lane b shows RNA from φ2064, which has
normal L, an MS chimera picked up from pLM1114, and a
normal segment S. Lane c shows RNA from φ2323, which
contains normal L, the MS chimera shown in b, and a deleted
segment S that contains no genes and is only 798 bp. Lane d
shows RNA from φ2361, which contains normal L, a chimera
of S and M, but no normal segment M or S.

φ6 can establish a stable carrier state in infected cells.
If virus is prepared with a reporter gene such as resistance to
kanamycin (kan) in segment M, one can isolate
kanamycin-resistant colonies after infection (34). The
frequency of forming carrier state infections is modified by
mutations in gene 5 of φ6, so that mutants can be isolated
that form carrier state in about 10% of infections. Carrier
state colonies can be passaged on plates containing
antibiotic. It has been observed that in some cases, after
many passages, deletions appear in segment S. In some cases,
the segment is lost entirely. This is an apparent anomaly
because the packaging model requires that segment S be
packaged before the others. When the sequence of segment L
in these carrier state colonies was determined, it was found
that a mutation had occurred in gene 1, which codes for the
major structural protein of the inner core. Procapsids
produced with the mutated form of P1 were found to package
segments M and L without the participation of S. Apparently,
the mutation resulted in the loss of accessibility to the S
binding site in the empty procapsid, so that the system
behaved as though S were already packaged (37).



Figure 16-8 Agarose gel electrophoresis of double-stranded
RNA isolated from virions. Lane v shows the distribution
of normal segments L, M, and S. Lane a shows
double-stranded RNA from φ2515, which contains the entire
genome of φ6 in one segment. The migration of the RNA
indicates a size of about 14,000 bp.

The finding that amino acid changes in protein P1 can
change the packaging program suggests that this protein is
the major determinant of packaging specificity. This is
supported by the finding that bound RNA can be crosslinked
to P1 (42). Directed changes in the pac sequences of φ6
segments S and M have been prepared and some of these
reduce packaging by as much as 10,000-fold. Suppressor
mutations that can accommodate the mutant RNA have
been isolated and they map in gene 1 (42). A number of
mutants in gene 1 have been isolated that result in the loss
of packaging activity. Inner cores can be assembled lacking
proteins P7 or P2 (4, 14, 19). In neither case is packaging
specificity altered; however, the particles missing P7 show
reduced packaging activity. Although protein P4 is the
motor for packaging, there is no evidence that it is involved
in the specificity of packaging. The location of the RNA
binding sites on P1 and the mechanism by which the binding
program changes have not been determined.

Reverse Genetics 

The first system for reverse genetics in φ6 made use of the
in vitro packaging and replication system (32). The Bamford
laboratory had found that nucleocapsids of φ6 could infect
spheroplasts of the host although they could not infect
normal bacteria because they lacked the attachment
proteins and the ability to penetrate the outer layers of the
cell. They found that removing the shell of protein P8 led to
the loss of infectivity, but that purified P8 could be reapplied
to the particles in the presence of calcium ions. Core
particles that had packaged RNA transcripts of cDNA plasmids
could be coated with P8 and successfully infect spheroplasts
(33). In this manner it was possible to prepare virus with
genomic deletions or additions, particularly with reporter
groups such as kanamycin resistance or lacα (35).

Once virus had been prepared with stable, nonreverting
deletions, it was possible to test for the ability of live virus
to acquire plasmid transcripts as replacements for defective
genomic segments. It was found that transcripts that
contained the full sequence content of a genomic segment,
even if bounded by nonviral sequence at the 5' and 3' ends,
could result in precise acquisition of the viral sequence (28).
In cases where the transcript did not contain the viral 3'
sequences, they could still be acquired; however, in these
cases the acquisition involved RNA recombination so that
the transcript could acquire a copy of a 3' end from one of
the other genomic segments. This process is called
heterologous or nonhomologous recombination (26). In the
case of φ8, it was found that transcripts that did not contain
the 5' pac sequences could be acquired through a process of
homologous recombination (38).

The easiest and most efficient procedure for reverse
genetics, however, was found to be electroporation of cDNA
plasmids of the three genomic segments into cells that
contain either SP6 polymerase or T7 RNA polymerase (48). In
these cases the transcripts, using SP6 or T7 promoters, start
at the beginning of the viral sequences but the 3' ends are
bounded by nonviral sequence. The resulting phage has correct
5' and 3' terminal sequences. Using this technique it has been
possible to prepare mutant forms of bacteriophages φ6,
φ8, and φ13. For φ6 and φ13, where the 5' sequence begins
with GG, T7 polymerase is preferred for the in vivo
transcription, while for φ8, which begins with GA, SP6
polymerase is preferred although T7 polymerase works at a
lower efficiency.

RNA-Dependent RNA Polymerase 

The RNA polymerases of the Cystoviridae are part of the
inner core particle. There seem to be 12 molecules per
virion and they are probably located under the 5-fold
vertices. The polymerases have some ill-defined sequence
requirements for template usage, but they do not determine
the specificity of packaging (24, 30, 53). In φ6 there is a
difference in (+) strand synthesis between segment L and the
other two segments. This is due to a sequence difference at
the 5' end of the (+) strand in that segment L has GU while
S and M have GG. The S and M templates support (+) strand
synthesis at about 10 times the rate of L and this is corrected
by changing the 5' terminal sequence of L to GG (8). This
difference might be important for the temporal control of
expression, in that the genes for segments S and M are
expressed late in infection when their transcription is much
greater than that of L. However, bacteriophage φ8 has no
difference in the 5' base sequence between the segments. In
vitro, φ8 transcription is about equal for all three segments,
while that for φ6 is heavily biased toward S and M.

The polymerase of φ6 has been purified and crystallized
(3). The crystals were of a high quality and an atomic
structure of the molecule was obtained through X-ray
diffraction studies with 0.2 nm resolution (figure 16-9) (2).
This was the first atomic structure for a polymerase of a
double-stranded (ds) RNA virus. The structure is of special
interest in several regards. It is remarkably similar to that of
the hepatitis C virus polymerase, which in turn is quite
similar to that of poliovirus. Structures of complexes with
nucleotides and oligonucleotides were also obtained. These
structures have led to a model for the initiation of (+) and
(-) strand synthesis (2). The polymerase has separate
channels for substrate nucleotide triphosphates and for
template. The polymerase can synthesize (+) strands from a
dsRNA substrate with the concomitant unwinding and
exclusion of the parental (+) strand, which does not enter the
polymerase. It can also synthesize (-) strands on the
template of single-stranded (+) strand RNA. In both cases
the dsRNA leaves the polymerase through a channel that is
normally blocked by the C-terminal domain of the polymerase.
The initiation of polymerization involves the pairing of an
NTP, usually GTP, with the second nucleotide from the 3' end
of the template. The template then backs up so that another
nucleotide can pair with the first nucleotide from the 3' end
of the template. Polymerization then begins and continues
with the extrusion of the dsRNA product from the polymerase.

The polymerase in φ6 procapsids does not begin RNA
synthesis until the packaging of all three segments is
complete. There seems to be control over the activity of the
polymerase. However, purified polymerase is able to carry
out (-) strand and (+) strand synthesis on various templates
(53). The polymerase does not have great specificity in terms
of the templates that it will use. The primary instrument for
specificity in these viruses is the packaging mechanism.
However, the polymerase does distinguish between the
dsRNA substrates for transcription. In most members of the
Cystoviridae, there is a difference in the second nucleotide of
the 3' end of the (-) strand. In φ6 the sequence of the (-)
strand is 3'-CCUUU in segments S and M, while it is
3'-CAUUU in segment L. This difference results in
transcription activity for M and S being 10-20 times that of
L. This difference is probably important in the regulation of
the temporal expression of the genes of the various segments.
Those of segments S and M are expressed at late times in
infection, while those of L are expressed early. The
distinction between early and late gene expression seems to
be a combination of the message stability of L and the
overproduction of S and M. Apparently, at early times the S
and M messages are effectively degraded so that they are not
translated. Bacteriophage φ8 is the only member of the family
that does not have a sequence difference at the 5' end of the
(+) strand between segment L and the others. It seems that
φ8 uses a completely different method of temporal control.
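
The 3' ends of the (-) strands quoted above follow from the 5' ends of the (+) strands in figure 16-4 by simple base pairing. The Python sketch below makes that correspondence explicit; the function names are hypothetical and only the first five bases of each segment are used.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def minus_strand_3prime(plus_5prime):
    """Complement the 5' end of a (+) strand base by base.

    The input is read 5'->3'; the result is the paired (-) strand end
    read 3'->5', the orientation used in the text (for example 3'-CCUUU).
    """
    return "".join(COMPLEMENT[base] for base in plus_5prime)


# First five bases of the (+) strands, copied from figure 16-4.
PLUS_5PRIME = {
    "phi6": {"L": "GUAAA", "M": "GGAAA", "S": "GGAAA"},
    "phi8": {"L": "GAAAU", "M": "GAAAU", "S": "GAAAU"},
}


def segments_differ(phage):
    """Return True if the (-) strand 3' ends differ between segments."""
    ends = {minus_strand_3prime(seq) for seq in PLUS_5PRIME[phage].values()}
    return len(ends) > 1


if __name__ == "__main__":
    for phage, segments in PLUS_5PRIME.items():
        for name, seq in segments.items():
            print(phage, name, "3'-" + minus_strand_3prime(seq))
        print(phage, "segments differ:", segments_differ(phage))
```

For φ6 this gives 3'-CCUUU for S and M and 3'-CAUUU for L, while the three φ8 segments share the same terminal sequence.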

Another problem for the polymerase is the choice
between the 3' end of the (+) or the (-) strand in dsRNA for
the transcription template. The polymerase normally chooses
the 3' end of the (-) strand because this sequence is very
high in A and U in all the members of the family. The ends
should be able to melt out very easily. It has also been
suggested that the polymerase prefers the sequence of the
(-) strand 3' end for reasons apart from its ease of
melting (53).

Figure 16-9 The structure of φ6 polymerase and comparison with hepatitis C virus (HCV) polymerase. A: Stereo image
showing secondary structural elements. B: Comparison of φ6 and HCV polymerases (2). See
thebacteriophages.org/frame_0160.htm for a color version of this figure.

Recombination 

Recombination of RNA molecules has been demonstrated
in almost all viral RNA systems. The mechanism of RNA
recombination in viruses is template switching (23). The
nascent chain detaches from its original template and then
associates with a new template along with the original
polymerase molecule or with a new polymerase. Recombination
frequency is increased when one of the packaged (+) strands
lacks a proper 3' sequence and consequently cannot serve
as a template for normal (-) strand synthesis (43). Most
recombination seen in the Cystoviridae is heterologous in
that the nascent chain seeks out a template with only about
3 nucleotides of identical sequence (26). However, φ8 is also
capable of homologous recombination wherein the
recombination occurs within a region of extensive sequence
identity. Homologous recombination in that case requires
about 600 nucleotides of identical sequence (38). It seems
likely that heterologous recombination is more useful to these
viruses than homologous recombination. The probability
of packaging two molecules of the same segment class is
very low, so homologous recombination would be a very
inefficient means of correcting defective sequence. In these
viruses, reassortment of segments occurs at high frequency
and would be an effective means of sequence correction or
change (5). The value of heterologous recombination would
be in the acquisition of completely new sequence either from
other viruses or from the host cell transcripts. In addition, it
is a rather efficient means of correcting 3' terminal deletions,
which might be rather significant due to cellular nucleases.
Packaging is solely dependent upon the integrity of the
5' end, so it is possible to package molecules with defects at
the 3' end; these could be corrected by template switching
from one of the other genomic segments. Although most of
the evidence for recombination in the Cystoviridae is the
result of experimental manipulation, the sequence of the
3' end of segment M in φ13 is striking in that it is an exact
copy of the 3' end of segment M of φ6 although the rest of
the segment has little sequence similarity (46). This
sequence is also very different from that of the other two
segments of φ13. It seems likely that the φ13 M terminal
sequence is the result of a recombination event.

Sequence Differences Among the Cystoviridae

The members of the Cystoviridae are very similar in
structure, but there are significant differences between many
of them in terms of nucleotide and amino acid sequence
as well as in some structural aspects (figure 16-10). The
genomic segments are of approximately the same size in the
members of the family. The sizes for φ6 are 2.9, 4.1, and
6.4 kbp respectively for S, M, and L. The arrangement of
genes is rather similar as well. Segment L encodes the
proteins of the inner core: P1, P2, P4, and P7. In φ13 gene 7
is the first gene in segment L; in φ6 there is a gene preceding
gene 7 that is not essential for phage development. It is
designated as gene 14. Some of the very close relatives of φ6
such as φ10 have the same arrangement, while others such as
φ7 and φ9 have an additional gene, designated as orfE,
preceding gene 14. These genes do not perform an essential
function but may be involved in regulating the expression of
gene 7 in a nonessential manner. φ12, which is rather
distant from φ6, has two genes preceding gene 7. φ8, which
is the most distant member of the family, has a gene 14.
However, φ8 differs from all the members of the family in
that the ortholog of gene 7 is found not near the 5' end of L
but rather at the 3' end of segment L. The gene in the usual
position of 7 has been renamed as gene H (49). The protein
coded by gene H is not a component of the inner core, but
does seem to play a significant role in the virus life cycle.
Deletion of gene H results in a virus that forms very small,
turbid plaques; however, mutants appear that are able to
form plaques of almost normal size. It is not clear what role
this protein plays.

φ7, φ9, and φ10 have very similar sequences to those of
φ6. Base sequence identity is very high, with most changes
in the third base of codons. Phages φ8, φ12, and φ13 are
very different from the others and from one another in base
sequence and even in amino acid sequences. φ13 is the
closest of the three to φ6, yet the amount of amino acid
sequence identity is 50% for the polymerase P2, while P1,
P4, and P7 show about 30% identity. These figures can be
contrasted with those for φ8, where there is almost no
similarity in proteins P1, P4, and P7 and the polymerase
shows only 20% identity with the polymerase of φ6.
φ12 polymerase shows 21%, 24%, and 20% identity to the
polymerases of φ6, φ8, and φ13. Protein P4 of φ12 shows
27% sequence identity to P4 of φ8 and of φ13 (11).

Figure 16-10 Similarities in cystovirus genes. φ8 and φ12 share about 20% amino acid identity in P2, while φ13 has 50%. φ8
shares some sequence similarity in gene 4 with φ12, but nothing else. φ12 shows some similarity in the host attachment
proteins to φ13 as well as similarity in gene 4. φ12 also shows similarity in the lysis genes 5 and 10 with the φ6 family. φ13
has 30% amino acid identity with φ6 in the inner core protein genes and genes 8 and 12, and a striking identity in base
sequence for the 3' end of segment M. See thebacteriophages.org/frames_0160.htm for a color version of this figure.

There are two major structural distinctions within the
family Cystoviridae. The first is that several of the viruses
attach to the host LPS rather than to a type IV pilus. Whereas
φ6 has an attachment apparatus consisting of the polytopic
membrane protein P6 that holds the attachment specificity
protein P3, φ8, φ12, and φ13 have a more complex attachment
apparatus consisting of several P3 proteins such as
P3a, P3b or P3c. The second distinction is that all the
phages except φ8 have an intermediate shell of protein P8
that covers the inner core and is involved in membrane
acquisition and host cytoplasmic membrane penetration.
In φ8, the protein P8 is a component of the viral membrane,
and the inner core is able to penetrate the host cytoplasmic
membrane on its own and to acquire the viral
membrane directly.

Although the attachment proteins are completely different
in φ6 and φ13, it is possible to substitute the M segment
of φ6 for that of φ13 (46). Even though the pac sequences
are completely different, the pac sequence of φ6 is able to
be recognized by φ13 and to be maintained in a stable fashion.
φ6 will not accept the pac sequence of φ13; however, the
genes of φ13 are acceptable if the genomic segment is supplied
with the pac sequence of φ6. The resulting membranes
are an interesting melange of the two systems. Protein P9
is the major membrane protein and is coded by segment S;
protein P10 is an important component of the membrane
and it would be coded by segment M along with P6 and
the P3 assortment. Membrane assembly is determined primarily
by protein P9 and the morphogenetic protein P12 (18).
This membrane would have to interact with the shell of P8.
These three genes are on segment S. The question is how
the attachment apparatus of P6 and P3 recognizes the membrane
composed of a heterologous P9. In any case, when φ6
uses the attachment proteins of φ13, it no longer attaches
to the pilus, but does attach to rough LPS.

φ13 shows 28%, 22%, and 22% identity to the φ6
proteins P6, P8, and P12 respectively. φ12 shows significant
identity to the proteins P6 and P3a, b, and c of φ13. These are
the host attachment proteins. The lysis proteins, P5 and P10,
of φ12 show 30% and 62% identity to proteins of φ6.

It is striking that φ12 has acquired or maintained the
lysis cassette that is similar to that of φ6 even though the
two genes are on separate chromosomes. It is also striking
that φ12 has maintained the attachment gene set that is
typical of the LPS binding group (15). We have found that
φ12 can acquire the attachment genes of φ6. It might be
that φ12 originally had both the attachment and lysis
genes of φ6. It seems apparent that there has been, and
probably still is, considerable genetic interaction between
the members of the Cystoviridae. The most dramatic
example is the complete identity of the 3' end of φ13 M to
that of φ6.


The stringency of the genomic packaging specificity 
serves to maintain the high efficiency of plating of the 
Cystoviridae but, at very low frequencies, the viruses are 
able to package RNA with imperfect pac sequences or no pac 
sequences at all (38). These RNA molecules cannot usually 
be maintained, but their sequences can be copied into 
the normal genomic segments by recombination. In this 
manner, these phages can exchange materials and acquire 
genetic information from unrelated sources. 

The Relationship of the Cystoviridae 
to the Reoviridae 

The members of the Cystoviridae have three double-stranded 
RNA genomic segments that are contained within a core 
structure composed of 120 copies of its major protein 
species. The Reoviridae have 10, 11 or 12 double-stranded
RNA genomic segments that are also contained within a
core structure composed of 120 copies of their major protein
species (16). The similarity in structure is striking, although
the Reoviridae do not have a component like the P4 NTPase
on the inner core particle. Rotavirus does have a nonstructural
protein, NSP2, that may play a similar role; however,
genomic packaging in the Reoviridae is not yet elucidated (50).
As expected, the Cystoviridae transcripts are polycistronic 
whereas the transcripts of the Reoviridae are primarily 
monocistronic as found for most viruses of eukaryotes. 

The Dark Side of T1 Phage

Although T1 got its initial fame from its use, in 1943, by
Salvador Luria and Max Delbrück in their landmark fluctuation
test (64), an entirely different reason has resulted in its
notoriety since then. This phage's resistance to desiccation
allows it to persist for weeks in an aerosol, which in
turn created a real nightmare for laboratories engaged in
research employing E. coli K-12 strains. It was not uncommon
to hear cries from laboratory workers that their
overnight bacterial cultures contained only floating debris.
This was the result of an onslaught by T1. Laboratories
were quarantined for days until the airborne count of T1
vanished. Bacterial geneticists have had T1-resistant E. coli
tonA mutants since the 1940s, and as a result many laboratories
protected themselves against unwanted invasion of T1 by
introducing null tonA alleles into their E. coli strain collection.
Even today T1 maintains its bad reputation because it
remains a problem, particularly in laboratories that only
casually use E. coli strains for experiments involving
cloning, protein overproduction, and analysis of genomic
libraries. Because of this, almost all commercially available
transformation-competent E. coli cells now contain a tonA
mutation. Despite the knowledge of T1-mediated calamities,
many laboratories continue to work with E. coli strains
that are not tonA. Apparently, it often takes a large-scale
disaster for laboratories to become believers in the incidental
havoc T1 can wreak on innocent bacterial cultures.
Despite its infamy, phage T1 is an interesting phage in its
own right. In this chapter we consider the biology of the
T1-like phages, concentrating on phage T1 and the related
phage TLS.

What’s in the Name? 

Phage TLS recently went through a name change due to its
perplexing history (37). Prior to 2001, it was known as U3 in
several laboratories across North America and Australia.
But there were clearly two distinct strains of U3, as is evident
from analysis of phage-resistant mutants. The U3 strain used
in the laboratories of Carl Schnaitman (5, 74) and Rajeev
Misra (58, 104, 105) yielded phage-resistant mutants that
concurrently displayed a classical “deep-rough” phenotype
(hypersensitivity to antibiotics, dyes, and detergents) of
lipopolysaccharide (LPS)-defective mutants.

A different U3 strain, which also requires LPS as its receptor,
was used by Malcolm Casadaban’s laboratory. In their
analysis of LPS mutants, the U3-resistant phenotype did not always
correlate with the hypersensitivity phenotype (94). The fact
that the two strains of U3 had different receptor specificities
was evident from biochemical analyses of mutant LPS (74,
94). A U3 strain used in Paigen and Beacham's laboratories
appeared to be similar to that used by Casadaban’s laboratory (78, 111).
They, like Casadaban’s laboratory, reported the requirement
of a galactose residue in the LPS core for infection. Watson
and Paigen also showed that their U3 strain is a small, tailless
phage resembling φX174 in its physical properties, and,
unlike our U3 strain, infected E. coli K-12 but not E. coli B
and C strains (111). As a result, this U3 strain was classified
as a member of the genus Microvirus (single-stranded DNA
genomes) in the viral family Microviridae.

Based on host range, receptor specificities, structural
features, and genome composition, the U3 strain used in
Carl Schnaitman's and our (Misra) laboratories is indisputably
a different phage (with a flexible tail (figure 17-1) and a
double-stranded DNA genome) from the one used by the
Casadaban, Paigen, and Picken laboratories. This obliged
a name change for the U3 strain reported later in the
literature. Consequently, the U3 strain used by us, and
described here, was renamed TLS, for TolC- and LPS-specific
phage (37).

What Is a T1-Like Phage?

Using a rigid phenetic approach, the International Committee
on Taxonomy of Viruses (ICTV) classifies phage into
genera based upon analysis of ultrastructure, host range,
lytic or temperate nature, genome size, and packaging
peculiarities (2). On the basis of these criteria phage T1 has been
classified as a lytic member of the Siphoviridae family (long
noncontractile tails) specific for Escherichia coli and Shigella
strains. Further distinguishing characteristics of T1 are
an extremely flexible tail (151 nm in length by 8 nm in
width) and an icosahedral head about 60 nm in diameter
(figure 17-1). One complementary ICTV characteristic worth
mentioning is the size difference in the two predominant
head subunits between λ (38 and 53 kDa) and T1 (26 kDa,
suspected, and 33 kDa). The last characteristic of the
T1-like phage genus is that the members have a genome
of less than 60 kb and have pac sites as opposed to cos sites.
Headful packaging from a pac site creates genomes that are
terminally redundant and partially circularly permuted (81).

Figure 17-1 Electron micrographs of phage T1 (47; with
permission) and phage TLS.

While there are over 50 phages that could be considered
relatives of T1 based on morphological evidence, the ICTV
has tentatively classified only the following enterobacterial
phages into the T1-like group: 102, 103, 150, 168, 174, β4,
D20, φγ, Hi, and UC-1. The TolC-specific phage TLS is
under consideration by the ICTV for proper species status
within this genus.

Another phenetic approach for classification as T1-similar
(“similar” to delineate it from the T1-like genus
description) phage could use a loose definition based on
mutual recombination or packaging. This is analogous to
the term “lambdoid,” which is reserved for phages that can
recombine with λ (e.g., HK97, P22, and HK620) regardless
of their morphological characteristics or ICTV taxonomy.
In a study by Hug and colleagues (47) a T1 type 3 variant
lacking the ability to synthesize its own DNA was able
to package the DNA of other phages at different rates. The DNA from
Hi, 150, 168, 172, and KD9 phages was packaged nearly as
efficiently as T1 control DNA. Interestingly, D20 did not
show efficient packaging even though it shares a highly
similar pattern of virion proteins on denaturing polyacrylamide
gel electrophoresis (SDS-PAGE) and, like phage 103, was
the only other phage inactivated by anti-T1 serum (50).

The lack of sequence data has prevented taxonomic char¬ 
acterization using a phylogenetic approach like that seen for 
a highly conserved head subunit gene which created a T4 
type group (109) or a proteomic comparison approach 
which depends on fully sequenced phage genomes (93). 

T1 Adsorption and DNA Entry 

The outer membrane components, including proteins and
LPS, are often exploited by phages to gain entry into the
bacterial cell. The early work by Theodore Puck and coworkers
demonstrated that phage T1 requires 10⁻³ M divalent or
10⁻² M monovalent ions for optimum attachment to its host,
E. coli B. The adsorption rate constant (~3 × 10⁻⁹ cm³
min⁻¹) is extremely high, suggesting that most collisions
lead to infection. While incubation at 37 °C leads to irreversible
adsorption, incubation at 2 °C results in phages which
are reversibly associated with the host cell. In order to
investigate phage attachment further, T1-resistant mutants of
E. coli strain B were isolated and found to fall into two
distinct classes. The first class, referred to as tonA mutants
(T one; lacking in strain B/1,5), does not allow any phage
attachment, while the second class, tonB, permits reversible
but not irreversible adsorption (35, 79). The gene for the
former phenotype, now referred to as fhuA, specifies a gated
outer membrane (OM) protein involved in the transport of
Fe³⁺-ferrichrome. This protein also functions as the surface
receptor for the phylogenetically unrelated phages (T1,
T5, φ80, UC-1, and ES18) and the bacteriocins, colicin M
and microcin J25 (12). The monomeric outer membrane
protein FhuA has been crystallized (32), revealing a two-domain
protein in which residues 160-714 form a β-barreled
structure in the outer membrane with the N-terminal
159 residues forming a globular periplasmic plug or cork.
The latter can be deleted with little change in the in vivo
protein functions including ferrichrome uptake or phage
attachment (11).
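
To give a feel for what an adsorption rate constant of about 3 × 10⁻⁹ cm³ min⁻¹ means in practice, the following minimal sketch applies the standard first-order adsorption model, in which the fraction of free phage decays as exp(-kBt) for a host density B. The model is an assumption brought in for illustration (it is not taken from the work cited above), and the host density of 10⁸ cells per ml is a hypothetical value.

```python
import math


def free_phage_fraction(k_cm3_per_min, cells_per_ml, minutes):
    """Fraction of phage still unadsorbed after `minutes`.

    Assumes simple first-order kinetics: P(t) / P(0) = exp(-k * B * t).
    """
    # 1 cm^3 equals 1 ml, so k (cm^3/min) times B (cells/ml) has units of 1/min.
    return math.exp(-k_cm3_per_min * cells_per_ml * minutes)


K_T1 = 3e-9          # cm^3 per min, the approximate value quoted in the text
HOST_DENSITY = 1e8   # cells per ml; hypothetical value chosen for illustration

for t in (1, 5, 10):
    frac_free = free_phage_fraction(K_T1, HOST_DENSITY, t)
    print(f"{t:2d} min: {100 * (1 - frac_free):.1f}% of phage adsorbed")
```

Under these assumptions roughly a quarter of the phage would be adsorbed within the first minute; the model ignores reversible binding and ion requirements, so it is only a rough guide.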

Random and site-directed mutagenesis of the surface-exposed
FhuA peptide loops revealed that insertions in
loop 4 (316-356) or loop 7 (502-515) influenced the phage
sensitivity/resistance phenotype in an insertion-dependent
manner (53). The insertion of four amino acid residues into
FhuA loop 4 resulted in cells which were fully resistant to T1
and displayed reduced sensitivity to φ80. Cells carrying insertions
in loop 7 were fully resistant to colicin M. Subsequently, Killman
and colleagues employed overlapping acylated hexapeptides
in infection-inhibition assays demonstrating that the receptor
specificities of the three phages overlapped to some
extent, with T1 specifically recognizing residues ³³³APADKGHY³⁴⁰
and ³⁴⁷VDDEKLQ³⁵³ (52). Since ³⁴⁰Y is involved in
ferrichrome binding, these studies explain the old observation
that ferrichrome can inhibit T1 adsorption (100).

Figure 17-2 Genomic maps of T1 (above) and TLS (below).

Another T1-like phage, C1 (63), utilizes the OM vitamin
B12 transport protein BtuB as its receptor, a feature it
shares with phage BF23 (6) and the E group colicins
and colicin R (61). Unlike T1, C1 phage infection is TonB-independent,
requiring instead the inner membrane
DcrA (SdaC) protein and the tethered-periplasmic protein
DcrB. These are thought to initiate the creation of a DNA
injection-competent channel (95).

UV-irradiated host cells allow reversible but not irreversible
adsorption (35), indicating that infection is an energy-dependent
process, a fact confirmed by the observation that
TonB, which is involved in energy-translocating processes,
interacts with FhuA in the periplasm (51). Phage T1 adsorption
leads to the rapid loss of cellular ATP and K⁺, and to
depolarization of the cell membrane (Δψ). In the presence
of high Mg²⁺ this phenomenon was transient, that is, the
membrane pores became resealed (50). The DNA enters
the cell in an oriented manner (39), which we interpret as
being from the right-hand end of the genetic map presented
in figure 17-2.

The TLS Receptor 

Work carried out in Carl Schnaitman’s laboratory reported 
that mutations in tolC and rfa (now called waa, 84) 
could confer resistance to phage TLS, indicating that 
both TolC and LPS are required for TLS infection (5). The 
waaP gene product, which phosphorylates the first heptose 
residue in the LPS core (98, 119), was found to be essential 
for TLS infection (74, 94, 104). The role of surface-exposed 
regions of TolC in TLS infection was recently confirmed 
through analysis of tolC missense mutants (37). Attempts 
to obtain mutations mapping in loci other than tolC and 
waaP have failed, thus reflecting the absence of any other 
nonessential cellular factors needed for TLS infection. 


Because the number of LPS molecules on the bacterial 
cell surface is much greater than that of the TolC protein, 
it may be that LPS serves as an initial source of phage inter¬ 
action. A reversible binding of TLS to LPS would then 
allow the phages to glide across the cell surface until they 
find TolC to initiate irreversible binding. If this were the 
case, overexpression of TolC in an LPS mutant (deleted for
waaP) might circumvent the requirement for LPS. However, 
this was found not to be the case, showing that the require¬ 
ment of LPS with its normal core is absolute. It also 
suggested that the role of LPS might not be simply to allow 
phages to find TolC but rather to serve as an essential compo¬ 
nent in the infection process. 

Genomics 

With the completion of the sequences of T1 (91) and TLS
(38) many of the secrets of how these phages behave
have been revealed. The unique sequence of the terminally
redundant and circularly permuted T1 genome is
48,836 bp. Since the total size of the virion DNA, based
upon PstI digestions, is 50.7 kb, the terminal repeats are
1.9 kb. This is less than the 2.8 kb previously suggested, and
the genome is larger than the earlier estimates of 46.9-49.5 kb
(27, 65, 89). In the case of TLS the unique sequence
is 49,902 bp with 1 kb terminal repeats, for a total genome
size of 50.9 kb. The overall base compositions of the two
phages are 45.6 mol% G-C (T1 DNA) and 42.7 mol% G-C (TLS), which
are both less than that of the host (51 mol% G-C). The A-T
content profile exhibits two spikes: the first is located
within the predicted left terminal repeat while the other is
found downstream of the tail spike genes. The latter
suggests possible lateral transfer of this specific region of
the genomes. The difference in G-C content between the
two phages is also reflected in the fact that there is remarkably
little perfect sequence identity between the two
genomes.
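
The genome-size figures quoted above follow from simple arithmetic: the terminal repeat is the virion DNA length minus the unique sequence, and the virion length is the unique sequence plus the repeat. The short sketch below reproduces those numbers; the variable and function names are illustrative only.

```python
# Values taken from the text, expressed in base pairs.
T1_UNIQUE_BP = 48_836
T1_VIRION_BP = 50_700    # 50.7 kb estimated from PstI digests
TLS_UNIQUE_BP = 49_902
TLS_REPEAT_BP = 1_000    # approximately 1 kb terminal repeats


def terminal_repeat_bp(virion_bp, unique_bp):
    """Length of the terminal redundancy implied by the two measurements."""
    return virion_bp - unique_bp


def virion_length_bp(unique_bp, repeat_bp):
    """Packaged DNA length: one unique sequence plus the terminal repeat."""
    return unique_bp + repeat_bp


t1_repeat_kb = terminal_repeat_bp(T1_VIRION_BP, T1_UNIQUE_BP) / 1000
tls_virion_kb = virion_length_bp(TLS_UNIQUE_BP, TLS_REPEAT_BP) / 1000

print(f"T1 terminal repeat: {t1_repeat_kb:.1f} kb")
print(f"TLS virion DNA:     {tls_virion_kb:.1f} kb")
```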
As with other phage genomes, the genes of these
two phages are densely packed (figure 17-2). One unusual
characteristic of the genes of these phages is the high
percentage of small open reading frames (ORFs), which
were particularly prevalent at the ends of the genome.
The T1 genome contains 77 ORFs while that of TLS contains
86 (figure 17-2). Approximately one half of the phage
proteins gave BLAST hits, but with few exceptions these
showed a low degree of sequence similarity to their “homologs.”
The exceptions were the genes that we assume are
involved in tail assembly, which showed >40% sequence
identity, usually to corresponding proteins from coliphage
N15 (83). The other gene showing a high degree of relatedness
is that which encodes a lysozyme-like protein. The two
phages shared 52 genes in common, with their proteins
having 26-77% identical residues (average amino acid
identity 53%).

Restriction and Modification 

It has long been known that T1 DNA is insensitive to EcoBI
[TGA(N8)TGCT] and EcoKI [AAC(N6)GTGC] type I restriction
endonucleases, but is sensitive to EcoPI [AGACC] type
III restriction endonuclease. In silico restriction analysis
revealed that both T1 and TLS DNA lack Haml II, KpnI, SacI,
SacII, or SphI sites and have a statistical underrepresentation
of other common restriction sites. In addition, T1 DNA does
not contain EcoBI or EcoKI sites, answering the old question
about how this phage escapes the common type I restriction
endonucleases present in its hosts. On the other hand, phage
TLS DNA has 13 EcoBI sites and a single EcoKI site.
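
An in silico search for the bipartite type I recognition sites quoted above can be sketched in a few lines. This is only an illustration of the general approach, not the procedure used in the original analysis: the function names are hypothetical, the toy sequence is invented, and a real search would read the phage genome from a sequence file. Because the sites are asymmetric, both strands are scanned.

```python
import re

_COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Recognition sequences as given in the text; N stands for any base.
SITES = {
    "EcoBI": "TGA" + "N" * 8 + "TGCT",
    "EcoKI": "AAC" + "N" * 6 + "GTGC",
}


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def site_to_regex(site):
    """Turn a site with N wildcards into a regex that finds overlapping hits."""
    # The zero-width lookahead lets overlapping occurrences be counted.
    return re.compile("(?=" + site.replace("N", "[ACGT]") + ")")


def count_sites(seq, site):
    """Count occurrences of an asymmetric site on both strands of `seq`."""
    seq = seq.upper()
    pattern = site_to_regex(site)
    forward = len(pattern.findall(seq))
    reverse = len(pattern.findall(reverse_complement(seq)))
    return forward + reverse


toy_sequence = "TGAACGTACGTTGCT"  # invented sequence carrying one EcoBI site
for name, site in SITES.items():
    print(name, count_sites(toy_sequence, site))
```

Counting both strands would double-count palindromic sites, so the same helper should not be applied unchanged to type II enzymes such as KpnI.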

The TLS gene 5 is highly similar to the stp gene found
exclusively in T4 and other T-even phages. T4's Stp is a 29 amino
acid peptide which provides protection from host type I
restriction enzymes including EcoprrI. Stp is considered
a “double-edged” sword during phage infection because
it prevents DNA restriction at the expense of activating
an endonuclease specific for the host tRNA Leu anticodon
(an ACNase) (77). Restriction of the anticodon furnishes an
atypical cleavage that is difficult for the host to repair
and leads to a nonproductive infection as well as sacrificial
death of the cell. T4 infection circumvents the ACNase
activity by producing Pnk (polynucleotide kinase), which
reverses the unusual cleavage. Homologs of pnk are found
in the TLS (gene 9) and T1 (gene 64) genomes, albeit only the
DNA coding for the equivalent of the C-terminal half of T4 Pnk
that is responsible for phosphatase activity (33).

The only two genes that had been cloned from T1 were
dam (N-6-adenine methyltransferase) and a small putative
downstream gene called HP83 (99). The complete genomic
sequences of T1 and TLS reveal that they both encode
Dam methylases, whose significance is unknown since
E. coli also expresses DNA adenine methylase activity. Other
phages specific for Dam-positive bacteria such as coliphages
T2, T4, and RB49 (26), Haemophilus phage HP2 (115), and
Shigella phage SfV (4) also encode a copy of this apparently
redundant protein. Lastly, TLS has a gene specifying a Dcm
methylase, a feature which it shares with enterobacterial
phages ε15 (NC_004775) and N15 (AAC19095). E. coli also
possesses a methylase which adds a methyl group at the 5
position of the internal cytosine residues in CCAGG and CCTGG.

Transcription and Translation 

Using pulse-labeling with ¹⁴C-labeled amino acids, Wagner
et al. (110) demonstrated three classes of protein expression:
early proteins (e.g., helicase, primase, and recombinase; 88),
whose synthesis was shut off at the onset of DNA replication
(108), an early-late class produced throughout the infective
cycle, and a late class (e.g., structural proteins and lysozyme;
108) whose onset of synthesis was delayed. This suggests
temporal regulation of late transcriptional or translational
events, which was confirmed by an inability to detect late
proteins in an in vitro coupled transcription/translation
system (108).

Early transcription in coliphages belonging to the
Caudovirales (1) normally involves host RNA polymerase
recognition of promoter sites which are defined by the
presence of the canonical hexamers (-35 TTGACA; -10
TATAAT) optimally separated by 15-19 bp (69). Transcript
elongation terminates by rho-dependent or rho-independent
mechanisms. Relatively little is known about transcription
or temporal gene expression in T1. Unlike phages such
as those of the T7 group, which specify single-subunit
RNA polymerases, transcription in T1-infected cells is fully
dependent on the host RNA polymerase (109). Furthermore,
neither T1 nor TLS contains genes homologous to RNA polymerase
subunits. In the only study of T1 transcription,
Gawron and colleagues (36) hybridized ³²P-labeled RNA,
isolated early and late in the phage infective cycle, to the alkali-denatured,
electrophoretically separated strands of T1 DNA.
They showed that while most of the hybridization occurred
to the slower-migrating strand, at early times post-infection
transcripts were observed to hybridize to the faster-migrating
strand. Examination of the arrangement of genes
on T1/TLS reveals that the bulk of the transcripts will be
transcribed from the bottom strand, except early in
infection when some transcripts will come from the complementary
strand. Early transcription probably involves divergent
transcription of the helicase-dam and primase-tail
fiber gene clusters resulting in DNA synthesis.
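
The promoter definition given above (a -35 hexamer TTGACA and a -10 hexamer TATAAT separated by 15-19 bp) can be turned into a very simple scan. The sketch below matches only perfect consensus hexamers on one strand, which is a deliberate simplification, since real promoters usually deviate from the consensus; the function name and test sequence are invented for illustration.

```python
import re

# Perfect -35 and -10 consensus hexamers separated by a 15-19 bp spacer.
# The lookahead allows overlapping candidates to be reported.
CONSENSUS_PROMOTER = re.compile(r"(?=(TTGACA([ACGT]{15,19})TATAAT))")


def find_consensus_promoters(seq):
    """Return (start, spacer_length) for each exact consensus promoter.

    `start` is the 0-based position of the first base of the -35 hexamer.
    Only the strand supplied is scanned.
    """
    hits = []
    for match in CONSENSUS_PROMOTER.finditer(seq.upper()):
        hits.append((match.start(), len(match.group(2))))
    return hits


toy = "GG" + "TTGACA" + "A" * 17 + "TATAAT" + "CC"  # invented test sequence
print(find_consensus_promoters(toy))
```

A practical search would score partial matches (for example with a position weight matrix) and scan the reverse complement as well.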

In both phages, the majority of putative late transcription
would, upon first glance, appear to occur in a single block
from left to right resulting in expression of the genes
involved in packaging and morphogenesis. This modality
is found, with minor variations, in all other phages. Closer
examination reveals, particularly in the “late operon,”
that T1 and TLS both contain an unusually high content
of rho-independent terminators. In coliphage T4 these sites
frequently contained a UUCG or GNRA loop sequence
(70) that none of the putative T1/TLS terminators possess.
The presence of transcriptional terminators led us to
speculate that either a transcriptional read-through system
exists such as that which occurs in lambdoid phage HK022
(48) or that downstream promoters must exist to direct
transcription of subsequent genes. In both phages we have
identified the latter (figure 17-2).

Figure 17-3 Consensus sequence logos of the T1 and TLS 21 nucleotide direct repeats. Produced using WebLogo
(http://www.bio.cam.ac.uk/cgi-bin/seqlogo/logo.cgi).

In addition, both phages possess numerous 21-nucleotide
direct repeats located in the intergenic regions or
overlapping the translational terminators of the preceding
gene. In the case of T1 we have identified 20, while
TLS possesses 19. While their sequences (figure 17-3) are 
reminiscent of UP-elements in E. coli (54), their positions 
suggest that they may function in a manner equivalent 
to eukaryotic enhancers. This suggests a far different 
approach to transcriptional regulation than has been seen 
with other phages such as the two common modalities 
exemplified by the lambdoid phages and T7. In the former 
case, a single promoter regulates transcription of the 
morphogenesis genes, while with the latter phages multiple 
transcriptional start sites are located in the late region. 
TLS and T1 appear to divide the late region up into a series 
of transcriptional modules (transcriptons) flanked by rho- 
independent terminators and containing RpoD-dependent 
promoters and perhaps enhancers. This molecular approach 
may account for the short latent period of 13 minutes 
observed with coliphage T1 (9, 24, 90). 

T1 infection causes a rapid cessation of host protein
synthesis initially, presumably as a function of the effect
of the phage on membrane function. But infection also has a
lasting inhibitory effect on the translation of existing host
mRNA (109). The reason for this is unknown.


Frameshifting 

Programmed frameshifting is a phenomenon associated
with translation, in which a stalled ribosome usually
slips +1 or -1 and continues translating the message,
resulting in a protein with an altered C-terminus (3, 41).
The slippery site AAA AAA GAG (LysLysGlu) occurs in
orf11. This gene has a downstream stem-loop structure
which would form a pseudoknot. Translational slippage
would result in a slightly larger protein terminating in
kheKKGGVSLLSYLYLLT rather than kheKKEA. In phage
TLS another example of frameshifting occurs in tfmA, specifying
a minor tail protein, which results in a fusion with
the downstream gene tfmB. A -1 frameshift at the slippery
site GTA AAA AAC (ValLysAsn) results in a fusion protein
VK(TfmA)-KKLKRAVYLYYQKPPTDAELQAVGLTRADYEGE
DPPEVIFDSS-(TfmB).
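
The effect of a -1 slip on the reading frame can be illustrated with a toy translation. In the sketch below only the slippery site GTA AAA AAC comes from the text; the downstream nucleotides are invented so that the zero frame hits a stop codon while the -1 frame reads on, and the output is not meant to reproduce the TfmA-TfmB fusion sequence quoted above. The function names are hypothetical.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(seq):
    """Translate from the first base until a stop codon or the end."""
    protein = []
    for i in range(0, len(seq) - 2, 3):
        residue = CODON_TABLE[seq[i:i + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


def translate_with_slip(seq, codons_before_slip, shift):
    """Translate `codons_before_slip` codons, shift the frame, then continue."""
    pivot = 3 * codons_before_slip
    return translate(seq[:pivot]) + translate(seq[pivot + shift:])


# Slippery site from the text followed by invented downstream sequence.
message = "GTAAAAAAC" + "GGTTAAGCATGA"

print(translate(message))                    # zero frame stops at TAA
print(translate_with_slip(message, 2, -1))   # -1 slip after Val-Lys
```

In this toy message the zero frame ends shortly after the slippery site, whereas the -1 frame bypasses the stop codon and yields a longer product, which is the essence of a frameshift-generated fusion.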

Host Genome Degradation and Phage 
DNA Replication/Recombination 

In the following sections we will discuss the genetic 
organization of T1 from a modular perspective. T1 and 
TLS encode several proteins which play roles in DNA 
metabolism including its degradation, synthesis, and 
recombination. After phage infection the host DNA is 
degraded and approximately two thirds of its phosphorus 
content is recovered in progeny viruses (56, 57). In spite of 
this, host genome degradation is not essential for phage 
development (90). The mechanism of host DNA degradation 
is not understood. Interestingly, mutants in an early gene 
(previously designated 2.5) are able to block host DNA 
degradation without blocking phage DNA synthesis (89, 90).
How this functions to regulate phage or host degradative
enzymes is not immediately apparent. Our analysis of the
phage genomes reveals several genes potentially involved
in nucleotide metabolism, including those having homology
to two phage kinases (pnk, dmk). In the case of pnk, the
TLS and T1 homologs are labeled PnkP because the truncated
proteins only match the C-terminal domain of
T4's Pnk, which codes for the phosphatase domain.

Initiation of DNA replication involves the coordinated
accumulation of replication-associated proteins at the
origin of replication (Ori). In the case of E. coli this entails
the initial binding of the replisome-organizer protein DnaA
and the subsequent localized melting of the DNA duplex
(16, 17). DnaA permits recruitment of DnaB (replicative
DNA helicase) (101) or a complex of DnaB with the helicase-loading
protein DnaC. Other proteins added to the replication
complex include clamp-loading protein (DnaX; PolC
proteins τ and γ), clamp-binding protein (DnaN), DnaG
(DNA primase), single-strand binding protein (SSB), DNA
gyrase, HU, integration host factor (IHF), and DNA polymerase
III (PolC). The replication complex coordinates
simultaneous synthesis on the leading and lagging strands
(13). In the case of temperate phages P22 and λ, gpO
functions as the replisome-organizer, recruiting to the
origin of replication a loading factor (gpP) in the case of λ
or gp12 helicase with phage P22. GpP binds to both gpO
and the host replication helicase (DnaB), resulting in
enlargement of the replication bubble (117).

The work of Bourque and Christensen (10), employing
host temperature-sensitive DNA replication mutants,
showed that DNA polymerase III (PolC), primase (DnaG),
and DNA clamp loader (DnaX) were required for T1 replication,
while DnaA, DnaB, DnaC, and DnaT (primosomal
protein i) were not. Other phages whose replication is
independent of host DnaA protein include P22 (96), P4 (103),
and SPP1 (75). There are conflicting reports on whether
DnaG, the primase for the lagging strand, is required. DNA
replication likely requires transcription for either torsional
stress or priming of DNA, because rifampicin, a transcriptional
inhibitor, also inhibits DNA replication.

Early T1 phage genetics and protein analysis showed
that two phage genes, specifying proteins of 38 and 65 kDa,
respectively, were required for DNA replication (88). Amber
mutations in either of these genes gave a DO phenotype,
that is, they were defective in DNA synthesis. The 38 and
65 kDa proteins are probably equivalent to the products
of the T1/TLS priA and helA genes, respectively. The presence
of an ATP-dependent helicase explains the growth of T1
on dnaB(ts) hosts (10). Helicases function to unwind
double-stranded nucleic acids in a 5'→3' or 3'→5' direction and are
classified into five superfamilies, of which the T1/TLS HelA
proteins are members of superfamily II (COG1061). Each
of the proteins in these superfamilies contains up to
seven identified motifs. In both phages, HelA protein Motif I
(G⁶²KT⁶⁴) most probably corresponds to the Walker A box
which is, along with the Walker B box (D¹⁴⁸ECH¹⁵¹, Motif
II), involved in MgATP interaction. Motif VI could be
Q⁴¹³LLGRGMR⁴²⁰ (T1) or Q⁴⁰⁴LLGRGMR⁴¹¹ (TLS), which
bear more than a passing resemblance to QTIGRAAR from
UvrB (18). If this is correct then the terminal arginyl residue
of this motif may interact with the gamma-phosphate of
the bound ATP. The only problem faced with correlating
HelA to the previously noted 65 kDa protein is that mutants
deficient in the gene specifying the latter protein have
been isolated. It is not apparent why it was possible to isolate
T1 mutants deficient in what appears to be a “redundant”
gene. An alternative hypothesis on the role of the putative
helicase gene advanced by German et al. (38) is based
on the sequence similarity between this protein and
RecQ. The latter protein is the helicase involved with
RecF-mediated Holliday junction formation (8).

Both phages also specify proteins (T1, 306 amino acids;
TLS, 308 residues) with homology to prophage (CP-933I,
φR73, Fels-1) and bacteriophage (P4) primases, and small
single-stranded DNA binding proteins (SSB analogs). Our
analysis and the results of Bourque and Christensen (10)
suggest that leading strand synthesis involves PolC, DnaX,
and DnaN proteins. The synthesis of the lagging strand
would, like that of T4, require SSB and a primosome
containing, presumably, the phage helicase and primase proteins.
The apparent requirement for both the host DNA primase
(DnaG) and the phage homolog cannot be explained at
this time. It would therefore appear that T1 replication
is an example of a relatively simple DnaA-independent
replication strategy which involves primase and helicase analogs
but has additionally done away with the need for DnaC.
The presence of phage-encoded SSB homologs is
quite common: they have been observed with phages
such as T3, T4, T7, 933W, P1, and prophage CP-933V.

Using Grigoriev DNA skew analysis, the inflection
indicative of the replication origin occurs within helA, as
it does in Salmonella phage P22. Phage replication origins
are frequently characterized by iterons such as are found
with r1t (123), A2 (71), BK5-T (67), φ31 (66), and TP901-1
(73). In neither phage T1 nor TLS is there evidence for
iterons within helA.
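
The idea behind a skew analysis can be illustrated with a short sketch. This is only a minimal Python illustration of a cumulative GC-skew curve in the spirit of Grigoriev's method, not the procedure actually applied to T1 and TLS; the function names, the window size, and the toy sequence are assumptions introduced here for illustration.

```python
def cumulative_gc_skew(seq, window=100):
    """Return (position, cumulative skew) pairs for non-overlapping windows.

    The per-window skew is (G - C) / (G + C); summing it along the
    sequence gives a curve whose turning points mark where the strand
    bias changes sign.
    """
    seq = seq.upper()
    total = 0.0
    curve = []
    for start in range(0, len(seq) - window + 1, window):
        chunk = seq[start:start + window]
        g = chunk.count("G")
        c = chunk.count("C")
        skew = (g - c) / (g + c) if (g + c) else 0.0
        total += skew
        curve.append((start + window, total))
    return curve


def skew_extremes(curve):
    """Positions of the minimum and maximum of the cumulative curve."""
    min_pos = min(curve, key=lambda point: point[1])[0]
    max_pos = max(curve, key=lambda point: point[1])[0]
    return min_pos, max_pos


# Toy sequence (hypothetical): C-rich first half, G-rich second half.
toy = "C" * 300 + "G" * 300
curve = cumulative_gc_skew(toy, window=100)
print(skew_extremes(curve))
```

On a real genome the window would be chosen relative to genome size, and the turning point would then be compared with the gene map, as was done here for helA.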

Previous studies on phage T1 demonstrated that it
encodes a general recombination system (“grn”) composed
of two genes (27, 85). While host RecABC will not substitute
for T1 grn, the E. coli RecE (exonuclease VIII)
recombinational pathway will (80). Both phages T1 and TLS specify
homologs of the host RecE protein and of Erf.
The latter is a member, along with host RecT, of a
single-strand DNA annealing protein family which also functions
in recombination. As with T7 replication, there is no
evidence for circular intermediates in T1 replication (81, 86,
87) and it is assumed that concatenated molecules, which
are the substrate for headful packaging, are generated by
end-to-end recombination. These are lacking in grn mutants,
leading to failure to produce packageable DNA.
Packaging

Conserved amongst all tailed bacteriophages is the
two-subunit terminase complex which cuts DNA and packages
it into proheads through the portal protein orifice. The
smaller of the two terminase proteins is thought to recognize
a packaging site, while the larger interacts with the
portal protein and catalyzes an ATP-dependent cleavage of
the template (118). Depending on the terminase structure
and phage DNA sequence, there are three common
mechanisms used to package DNA into a phage head
from the poly-genomic templates called concatemers.
These are unit-full (typified by λ or T7), random headful
(characterized by T4), and pac site-initiated headful as
observed with P22 (118).

T1-like phage terminases recognize a pac site on the
concatemers to catalyze only the first round of packaging.
Subsequent rounds of packaging are initiated at the
original cut site with the finishing of subsequent rounds
occurring somewhat randomly as the head fills with
over one genomic length of DNA. This creates virion
DNA with terminally redundant ends of approximately
1500-2200 bp in P22 (30, 97), 1000 bp for TLS (German
and Misra, unpublished observations), and 2800 bp for T1
(65). Another corollary to headful packaging is that the
DNA packaged in one virion has a different gene order
compared with another virion packaged in a different
round: this phenomenon is called circular permutation.
With phages P22, T1, and TLS, packaging does not result
in an infinite number of permuted molecules because
only three or four rounds of packaging occur on a given
concatemer. The telltale symptom of pac-mediated
packaging is that genomic DNA cut with restriction enzymes
reveals submolar fragments resulting from only a portion
of the mixed population (62). For instance, the TLS
genome contains a distinct 6 kb submolar HindIII-generated
fragment which is created by the pac cut near base 1 of
the TLS genome and a HindIII site 6 kb downstream.
There are also two submolar, highly diffuse bands at
approximately 5 and 4 kb which presumably derive from
the second and third packaging events and the common
downstream HindIII site.
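
The arithmetic behind the 6, 5, and 4 kb submolar bands can be sketched as follows. This is a simplified Python model that assumes each headful is exactly one genome plus the terminal redundancy and that cuts are precise (the text notes they are in fact somewhat random, hence the diffuse bands). The genome length of 50,000 bp is a placeholder assumption, since the TLS genome size is not stated in this section, and the function names are invented for illustration.

```python
def headful_starts(genome_len, redundancy, rounds):
    """Genome coordinate (0-based) at which each sequential headful begins.

    Each head takes genome_len + redundancy bp from the concatemer, so
    every new round starts 'redundancy' bp further along the circular map.
    """
    headful = genome_len + redundancy
    return [(i * headful) % genome_len for i in range(rounds)]


def submolar_fragments(starts, site):
    """Length from each packaged end to a fixed downstream restriction site."""
    return [site - start for start in starts if start < site]


# Assumed values: 1000 bp redundancy (TLS, from the text), HindIII site
# 6 kb from the pac cut (from the text), 50 kb genome (placeholder only).
starts = headful_starts(genome_len=50_000, redundancy=1_000, rounds=3)
print(starts)                                  # start of each round
print(submolar_fragments(starts, site=6_000))  # end-fragment sizes in bp
```

Under these assumptions the three rounds start at 0, 1000, and 2000 bp, which reproduces end fragments of 6, 5, and 4 kb and illustrates why the number of permuted ends is limited by the number of packaging rounds.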

The P22 pac consensus site was initially proposed to be
AAGATTA (20) and then modified through mutagenesis
studies to GAAgATTTatCTGaaGT (118). The pac site for
TLS is located within an 81 base stretch of DNA that does
not contain cytosines in the top strand, and near the end
of this stretch is a 5'AGATTT3' sequence, where the first
“A” is the first base of the TLS genome. Likewise, a stretch of
60 bases without a cytosine in the top strand and with a
5'AGATAT3' at its end is found in T1, and the “G” would be the
225th base of the T1 genome (German, unpublished
observations). This location matches well with previous studies
that indicated the T1 pac site was approximately 1 kb
upstream from one EcoRI site at 1323 bp (82).
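
A search for cytosine-free stretches of the kind described above is easy to sketch. The following Python fragment is an illustrative assumption about how such a scan could be done, not the method used by the authors; the function names, the slack parameter, and the toy sequence are hypothetical.

```python
import re


def cytosine_free_stretches(seq, min_len):
    """Return (start, end) spans of top-strand runs containing no C."""
    pattern = re.compile(r"[AGT]{%d,}" % min_len)
    return [(m.start(), m.end()) for m in pattern.finditer(seq.upper())]


def motif_near_end(seq, end, motif, slack=10):
    """True if 'motif' occurs within 'slack' bases of the end of a stretch."""
    begin = max(0, end - len(motif) - slack)
    return motif in seq.upper()[begin:end]


# Toy sequence (hypothetical): a C-free run ending in AGATTT.
toy = "CC" + "AGT" * 5 + "AGATTT" + "CCG"
for start, end in cytosine_free_stretches(toy, min_len=10):
    print(start, end, motif_near_end(toy, end, "AGATTT"))
```

On a real genome one would run the scan with a minimum length of several tens of bases and then test each hit for the AGATTT or AGATAT motif.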


Besides pac, the T1 phage requires another DNA locus
called pip. Overall packaging efficiency is reduced in T1 pip
mutants and they are only able to catalyze a single round
of packaging from templates (28). The pip phenotype
is somewhat reminiscent of that observed with λ cosQ
mutants, which are also only able to package the first event
(114). In comparing the genetic and genomic maps, pip is
located near dam (27). We have been unable to define the
nucleotide sequence of pip.

Phage Assembly 

The genetic map for T1 consisted of a block of 10 head
genes (27), two of which likely encoded the small and large
terminase subunits. TLS and T1 genomes have at least
13 genes from the small terminase subunit to just before
the predicted tail subunit. In the “head region” their gene
maps are considerably different from the eight prototypical
patterns found in diverse genomes that are postulated
to be a λ-supergroup of Siphoviridae (15). For instance, T1
and TLS appear to have the genes for two proteolytically
processed head subunit homologs: subunit A (T1 gene
50 and TLS gene 33) and subunit B (T1 gene 47 and
TLS gene 36). SDS-PAGE analysis (figure 17-4) of T1 and TLS
showed a common abundant band at 27.9 kDa (for T1)
or 29.1 kDa (for TLS). Mass spectroscopic analysis of T1’s
band indicated that it is a 26,588 Da processed product and
is coded by T1 gene 47 (91). The T1 gene 47 corresponds
to TLS gene 36 and they code for protein products of 35,290
and 36,536 Da, respectively. The processing of the head
subunit is reminiscent of HK97’s, where a 42 kDa precursor
is processed to 31 kDa (29). Furthermore, the small size of
the single gene between the portal gene and the first head
subunit suggests that, like HK97, T1 gene 51 and TLS gene
32 code for a head protease. An extended 3' portion found
in T1 gene 48 versus TLS gene 35 reflects a difference
between the genomes of the two phages. This extended portion
codes for an immunoglobulin-like domain, which is often
found in phages that have head decoration proteins like
T4’s hoc (21). λ also encodes a head decoration protein
(gpD) while P22, HK97, and L5 do not (19, 29).

Figure 17-4 SDS-PAGE analysis of the structural proteins.
Left: Phage T1. Right: Phage TLS. Approximately 10^10 phage
particles were loaded per lane. Gel was stained with
Coomassie blue.

The SDS-PAGE analysis of the two phages shows the
presence of several additional bands for T1 (figure 17-4).
One band difference is possibly due to the T1 gene 48
product (calculated size 26,588 Da) while the corresponding
gene 35 product for TLS (calculated size 16,355 Da) could
co-migrate with a putative FII-like head-tail joining protein
found at a calculated 16,615 Da for TLS gp34. There is also a
corresponding protein band at 17,007 Da for T1. The portal
proteins of both phages are detected by SDS-PAGE analysis
at their predicted locations of roughly 50 kDa
(figure 17-4). Like the P22 portal protein (92), which is a
12mer, the T1 and TLS portal proteins appear to require
cysteine cross-linkages because under nonreducing
conditions the 50 kDa band is not detectable for either phage. T1
is also expected to carry an injectable phage protein in its
head. This is based on the observation that host protein
synthesis was blocked at the level of messenger RNA translation
before the commencement of synthesis of any viral protein (109).

T1 was genetically shown to have at least nine genes
responsible for producing phage tails (27). Eight of the nine
were clustered after the head genes. The last gene was
separated from the tail gene cassette by two recombination genes
and thought to code for a tail assembly factor. The genomic
map has shown that the tail assembly factor gene is actually
the side tail fiber (stf) gene. Contrary to other known phage
genomes, the transcription of the side tail fiber gene is in
the direction opposite to the rest of the tail genes.

TLS’s and T1’s side tail fibers share strong homology
for only the first 100 amino acids or roughly 20% of the
protein. Thereafter TLS gp55 only shows homology with
a Salmonella typhi prophage protein, while T1 gp26 shows
exclusive homology with a hypothetical protein from
Photorhabdus luminescens. Consistent with this divergence, in the
last published electron micrograph of T1 it was shown to have
short and club-shaped tail fibers (47), whereas electron
micrographs of TLS show long and thin tail fibers containing a
kink (unpublished observations). T1-like phages contain an
extremely flexible tail. The tail subunit, which is most likely
the factor responsible for the flexible tail, is not homologous
to that of any other known phage but rather to prophage
proteins from Yersinia. The homology to
known phages (HK97 and N15) starts with the tail tape
measure gene and continues for the rest of the cassette, and
the tail is therefore likely to assemble in the prototypical pathway
outlined for λ (19, 49). The last gene in the cassette, tspJ,
codes for the predicted tail spike and host specificity factor.
In λ, host range mutants were obtained by alterations
localized at the C-terminus of the tail spike (112). Differences
between T1 and TLS tail spike sequences are rather minor
except that T1 contains a 30 amino acid repeat (starting at
the 906th amino acid) preceding the region of λ host range
mutants while TLS contains at least one insertion of 18
amino acids following this region (starting at the 1103rd
amino acid).

Lysis 

At the end of the latent period bacteriophage release is
brought about through the concerted action of a holin,
which creates pores in the inner or cytoplasmic membrane,
and an endolysin, which escapes through the pores to
hydrolyze the peptidoglycan layer in the periplasm. In
almost all cases these proteins are specified by a two-gene
lysis cassette in which the holin gene precedes or overlaps
the lysin gene (120, 121; chapter 10). Holins are characterized
by their relatively small size (67-145 amino acid residues),
usually contain two or three membrane-spanning helices,
possess a charged C-terminus, and exhibit poor sequence
identity to other members of this group of functionally
similar proteins (40, 120, 121). The putative TLS and T1 holins
contain predicted single-transmembrane domains and
possess positively charged C-termini. Examples of other
single-transmembrane domain holins include the Borrelia
burgdorferi cp32 prophage holin protein BlyA (23),
Mycobacterium phage Ms6 hol (34), and Haemophilus influenzae phage
HP1 hol (31). Phages have evolved a variety of murein
hydrolases, and in the case of these two phages the endolysin genes
specify a lysozyme which cleaves the β-1,4-linkages between
adjacent N-acetylmuramic acid and N-acetylglucosamine
residues in cell wall peptidoglycan.

In the case of both phages, downstream of the lysis 
cassette are located genes whose amino acid sequences 
reveal two transmembrane domains at both ends of the 
protein. No homologs have been discovered. Whether these 
putative gene products function as additional holins or are 
similar to the poorly defined lysis proteins Rz; 122 in phages 
such as λ and P22, is unknown.

Miscellaneous 

T1 possesses three homing endonucleases of the HNH
group, while TLS has two. These site-specific endonucleases
are found in group I and group II introns (59) or as
independent genes in bacteriophages (7, 19, 22). In all cases
the proteins are homologous to endonucleases of Xanthomonas
oryzae phage Xp10 (GenBank accession no. AY299121) and
Yersinia enterocolitica phage φYeO3-12 (GenBank accession
no. NC_001271). While phage Xp10 has
10 homing endonuclease genes, those from TLS and T1
are only homologous to four.

Morons are sequences, flanked by promoters and
transcriptional terminators, often found inserted within a
group of co-transcribed genes (48). Examination of the
T1 genes (figure 17-2) reveals that orf30 lies in the opposite
orientation to its flanking genes and to its orientation in
coliphage N15. Furthermore, it is separated from orf31 by
a transcriptional terminator. This gene specifies a protein
called Cor, the product of an N15 lysogenic conversion gene,
which is responsible for surface exclusion of T1, φ80, and
N15 (83). Homologs are also synthesized by temperate
phages HK022 (48) and φ80 (68). Why a virulent phage such
as T1 should specify such a gene is unknown. TLS genes 52
and 60 are examples of genes inserted between mutually
conserved regions of the two phages.

Evolution of the T1-Like Phages

Genomic analysis, predominantly on the temperate
coliphages and viruses infecting bacteria involved in the dairy
industry, has revealed the truth in the assertion that phages
evolve through recombinational exchange with a large
common genetic pool (14, 25, 43-45; chapter 4). The
consequence of this is that many phage genomes are genetic
mosaics with regions of homology interspersed with regions
which have no homology. Access to this genetic pool will
vary depending upon the ecological isolation of the host
and the physiology of the phage. It has been argued that the
virulent phages, particularly those that induce massive
degradation of host DNA and have well-organized
replication or packaging strategies, will exhibit a lower incidence of
horizontal gene transfer (55, 76). The high degree of
homology between the essential genes of T1 and TLS and the
strikingly similar overall genomic layouts in these two phages
unequivocally confirm that they belong to the same lineage.
Their genomes are also mosaics, as revealed by the
observation that the T1/TLS genomes contain regions with clear
homology to other phage and prophage genomes. Homologs
occur in phages infecting bacterial phyla Actinobacteria
(mycobacteriophage Bxz1), Firmicutes (Lactobacillus phage
LL-H) and, as expected, members of the Proteobacteria.
Two areas of particular interest are genes 38-30 (T1) and
45-50 (TLS), which are related to coliphage N15, and, with
the exception of T1 gene 42, genes 44-33 (TLS genes 39-50
except 41), which are related to contiguous prophage
sequences in Yersinia pestis.

One of the more interesting aspects of the genomics of T1/
TLS is the presence of four linked genes which have been
implicated in tail assembly in a number of members of the
Siphoviridae infecting, or carried by, members of the class
γ-Proteobacteria, orders Enterobacteriales (Salmonella and
Escherichia) or Alteromonadales (Shewanella) (42). Phage
φE125 resides in Burkholderia thailandensis (116), a member
of the class β-Proteobacteria. A further unifying feature is
that all the free-living phages (φE125, HK97, HK022, N15,
and φ80) are classified as λ-like viruses at NCBI, suggesting
that at a higher phylogenetic level T1/TLS might be
considered to be part of a possible order “X” within the
Siphoviridae. Taking a “total evidence approach” to the origin of T1,
a phylogenetic tree was constructed by alignment of
“polyproteins” composed of the gp37-34 and their homologs,
with a Homo sapiens protein as the outlier. The results
(figure 17-5) offer robust support for the existence of several
lines of descent, which include λ and its prophage relatives
(Gifsy-1, Gifsy-2, and Fels-1), the N15-HK97-HK022 cluster,
and three deeply rooted clades involving T1/TLS, φE125,
and λSo. The moron carrying Cor apparently was present in
the phage genomes before the branching which gave rise to
N15-HK022 and T1, and was lost in TLS and HK97.

Figure 17-5 Phylogenetic analysis of tail cone assembly proteins using a “total evidence approach.” The tree was
constructed by alignment of “polyproteins” composed of the T1 gp37-34/TLS gp46-49 and other homologs recovered
from GenBank. The alignments were constructed using ClustalW (46) and analyzed further using TreeCon (106).

It has been conservatively estimated that E. coli and
Salmonella enterica diverged 120 million years ago (Myr)
(60, 72). Employing the approach of Whittam et al. (113) we
aligned common genes of E. coli K-12 with those of S. enterica
sv Typhimurium, showing that these bacteria display an
average of 18.3% divergence in the nucleotide sequence.
From these values we calculated that a divergence of 1% is
equivalent to 6.5 Myr. Since phages are obligate parasites
it has been hypothesized that they evolve at a rate similar
to their hosts (44). Using the phage terS, terL, recT, hel,
portal, major head, and tail tape measure genes as
chronometers of evolution, T1 and TLS exhibit 40.5% divergence.
This is equivalent to 260 Myr. Using the same approach
with the terS-major head gene cluster and the temperate
lambdoid phages HK97 and HK022 (1.7% divergence), their
last common ancestor was a mere 11 Myr ago. Two
possibilities exist to explain these values: either the T1-like line of
viruses is extremely old, or the rate of evolution of temperate
phages is slower than that of their virulent cousins.
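
The dating arithmetic above can be retraced in a few lines. This Python sketch simply reproduces the calculation from the figures quoted in the text; the constant and function names are assumptions introduced for illustration. Note that 120 Myr divided by 18.3% gives about 6.56 Myr per percent, which the text rounds to 6.5.

```python
HOST_SPLIT_MYR = 120.0       # E. coli / S. enterica divergence time (text)
HOST_DIVERGENCE_PCT = 18.3   # average nucleotide divergence (text)

# Calibration implied by the host data, and the rounded value used in the text.
MYR_PER_PERCENT_EXACT = HOST_SPLIT_MYR / HOST_DIVERGENCE_PCT
MYR_PER_PERCENT_TEXT = 6.5


def divergence_to_myr(percent, myr_per_percent=MYR_PER_PERCENT_TEXT):
    """Convert percent nucleotide divergence to time, assuming a host-like clock."""
    return percent * myr_per_percent


print(round(MYR_PER_PERCENT_EXACT, 2))     # calibration from host data
print(round(divergence_to_myr(40.5), 2))   # T1 versus TLS
print(round(divergence_to_myr(1.7), 2))    # HK97 versus HK022
```

With the rounded calibration the two estimates come to roughly 263 and 11 Myr, in line with the 260 and 11 Myr quoted above; both depend entirely on the assumption that phage and host genes diverge at the same rate.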


Unanswered Questions 

While the completed genomic sequences of T1 and TLS have
answered a number of important questions about these
phages, there are still areas in need of research.

1. Transcription: how transcription is temporally
regulated; the number of promoters and the potential role
of the 21 nucleotide direct repeats in transcription.

2. DNA replication: mechanism(s) of host DNA
degradation; role of T1/TLS primase and helicase in phage
DNA replication.

3. Modification: role of Dam and Dcm methylases.

4. Proteins: the cellular location and function of the large
number of putative proteins showing transmembrane
domains, and the correlation of T1 amber mutants
with genes.

5. Tails: a hallmark of T1-like phages is that the tails are
extremely flexible. Is the subunit geometry different
from other phages? What is the nature of the receptor
for the two tail spike proteins?

6. The mode of DNA entry.

7. A detailed study of morphogenesis of these phages and
the location and role of the structural proteins.

8. Lysis: we have predicted that the holin has a single
transmembrane domain. The lysis cassette
holin-endolysin-Rz homolog requires further
analysis.

9. Stability: what physicochemical aspect of T1’s
structure contributes to the phage’s extraordinary resilience
to desiccation and persistence in aerosols?
History

Bacteriophages T2, T4, and T6 were among seven Escherichia
coli phages selected by Max Delbrück to study
fundamentals of viral replication in a limited number of model
viruses. These studies led to the first formulation of many
concepts that are now accepted foundations of molecular
biology: the fundamental differences between growth of
viruses and cells (figure 18-1) (109); the demonstration
that nucleic acids of virus particles suffice to establish
infection and to direct synthesis of complete virions (163); the
concept of the gene, including distinctions between units of
recombination, mutation, and function (30); genetic
recombination as exchange between DNA molecules involving
“heterozygous” overlaps (91, 164, 165); the demonstration of
messenger RNA (mRNA) (55, 414) and the non-overlapping
triplet code (83, 390) with nonsense triplets as
termination signals (31); the repair of DNA damage in the light
(103) and in the dark (149); restriction and modification
of DNA (247); the presence of spliced and nonspliced
introns in prokaryotes (25, 171); the definition of pathways
leading to the assembly of complex macromolecular
structures (105); and the importance of protein complexes
(machines), which change composition during various DNA
transactions (9, 275).

Many other phages from different parts of the world 
are classified within this family, based on similar sequences 
and map positions of their essential genes, regulatory 
patterns, virion structure and serological properties. They 
are now collectively called the “T-even” phages (192). More 
distantly related phages have been called pseudo-T-even 
phages (2, 13, 148, 266, 397) or schizo-T-even phages, 
although the criteria that distinguish them are ambiguous. 
These phages infect many different Gram-negative bacteria 
in various environments, from mammalian intestines to 
marine cyanobacteria and other bacteria. 


Overview 

The genomes of the T4-related phages are contained in
large (~170 kb), linear, double-stranded (ds) DNA
molecules whose termini contain repetitions of approximately
3% of the genome. The termini are randomly permuted
over circular maps (figure 18-2) (273, 276, 289, 388, 389, 399).
The DNA of most T4-related phages contains
hydroxymethylcytosine (HMC) instead of cytosine. In most members
of the family, the HMC residues are further glycosylated
to different extents. These modifications together allow
escape from host and phage restriction enzymes and are
important for the developmental strategy of these phages.

DNA is packaged in elongated “heads” of quasi- 
icosahedral symmetry. DNA-filled heads are joined to 
independently assembled tails whose baseplates and fibers 
(figure 18-3) are instrumental for recognition, adsorption, 
and injection of the DNA into host bacteria. Different 
phages recognize different receptors in different host 
strains. Apparently, recombination within their genomes 
or with genomes of plasmids and other phages or prophages
facilitates rapid evolution of T-even tail fiber genes
and adaptation to different hosts (156, 396, 397). 

T4 can grow well in many other Gram-negative 
bacteria if the bacteria are converted to spheroplasts 
and the phage are treated with urea (416). The urea circumvents
the first adsorption stage (and host specificity) by
altering the baseplates and tails of the phage particles so 
they can release their DNA into spheroplasts upon contact 
with bacterial inner membranes. The final injection of 
DNA requires membrane potential (135). 

T-even phages are some of the most successful molecular
parasites. Their genomes code for most phage-specific DNA
replication, recombination, and repair functions. Because
many of these proteins are similar in structure and function
throughout all living organisms, the analysis of T4 proteins
has greatly contributed to general understanding of these
processes.

Figure 18-1 Overview of the T4 life cycle. Modified from (253).

Nevertheless, like all viruses, T-even phages depend for
their propagation on many vital structures and functions of
their hosts, such as membranes, energy metabolism,
transcriptional and translational machines, and some
chaperones. They manage to usurp host structures and functions
in an exquisite choreography that allows adaptations to
different environmental conditions, including the different
physiological states of the host.

T4 is the most thoroughly investigated member of the
T-even phages, mainly because the isolation of a large
collection of conditional lethal mutants (110) provided a powerful
impetus for molecular analyses by biochemical and
biophysical methods. The results of the combined efforts of many
research groups working on T4 and related phages are
summarized in a monograph (192) and in a recent review
that emphasizes bioinformatic aspects of the annotated T4
genome (262). Comparisons with other members of this
phage family have been reviewed (87, 225, 226, 266) and
will be extended as sequences of other T4-related phages
are being published. References cited here are mainly to
summarizing chapters in (192) and to papers published
since then.

We emphasize that the recipe for T4’s success as a
molecular parasite is based on multiple redundant pathways
for most, if not all, physiologically important DNA
transactions: transcription, translation, replication, recombination,
repair, and packaging. These pathways are interconnected
at many levels (figure 18-4). Many of the cross-connections
are promoted by certain proteins that can participate in
different complexes and pathways. This allows
communication between different pathways and their adaptation
to different environmental conditions. The redundancies
are most likely also important for evolution. However, for
the sake of clarity, we discuss these different pathways and
their regulation during phage development separately, after
this overview.

T-even phages (in contrast to the T7- or N4-related phages;
chapters 20 and 21) use the core host RNA polymerase
throughout the infectious cycle, and the gradual subversion
of host functions to support different aspects of phage
propagation is achieved in many small steps:

1. A cascade of phage-induced proteins covalently and
noncovalently modifies the host RNA polymerase and
its accessory transcription factors (291). Together
these modifications modulate processivity (termination)
of elongating RNA polymerase and allow sequential
recognition of different classes of promoters to
selectively transcribe HMC-containing DNA. Thereby
all host transcription is inactivated, and different
classes of phage promoters are temporally controlled.

At least one of the T4 proteins, an enzyme that
ADP-ribosylates one α-subunit of host RNA polymerase, is
packaged into virions and injected with the phage
DNA into the next host bacterium. The sliding clamp
required for DNA replication is also required for late
transcription, connecting these two processes. The
classification of transcripts is complex, as described
below, largely because most genes are transcribed
from multiple promoters, and protein synthesis is
affected by subsequent translational controls.

2. RNA processing by phage and host enzymes,
translational repressors, and poorly understood
modifications of ribosomes all modulate T4 gene expression
(69, 261). No transcriptional repressors (in the original
meaning of the word) are known. Most likely,
post-transcriptional modulators, including translational
repressors, are more suitable than transcriptional
regulators for adjusting to rapid physiological
changes during the short T-even development: one
growth cycle is completed in less than 30 minutes
at 37 °C.

3. The host DNA and host mRNA, present at the time
of infection, are rapidly degraded, and the breakdown
products are efficiently reused to synthesize phage
DNA and RNA.

4. The onset of the first phage DNA replication from
specific origins requires host RNA polymerase to
generate primers and is thereby influenced by the
physiological regulatory processes of the host. Most
subsequent T4 replication is entirely phage-controlled,
escaping the host’s controls. It depends on
phage-encoded replication and recombination proteins and
on primers that are intermediates of homologous
recombination, thereby connecting replication,
recombination, and repair of DNA (82, 217, 218, 277).

5. During the later stages of development, proteins
involved in packaging DNA also become important
for DNA replication, thereby coordinating these two
processes (290).

Figure 18-2 T4 gene maps. The outer circle shows the approximate positions of characterized genes on the 169,903 base pair
DNA, drawn as a circle. The next three genome segments show three maps derived from determining distances
of chromosomal ends from rI, rII, and rIII, respectively (278). Note that the relative length of the small rI molecules was
0.77, not 0.68 of the normal T4 chromosome. The next (full) circle indicates the positions of genes derived from genetic
crosses (105). The innermost circle indicates the positions of heteroduplex loops after annealing of heat-denatured T2,
T4, and T6 DNA (201).

Figure 18-3 Structure of the T4 virion. Image is based on negative stain and cryo-electron microscopy, and crystallographic
data. The locations of the protein components are indicated by gene number. The portal vertex composed of gp20 is
attached to the upper ring of the neck structure, inside the head itself. The internal tail tube is inside the sheath and itself
contains a structural component in its central channel. The baseplate contains short tail fibers made of gp12; these are
shown in a stored or folded conformation.

Genome Structure and Map 

The genome of T4 resides in 169,903 base pairs (bp)
of dsDNA containing glucosylated HMC residues (262).
The HMC residues are glycosylated to different extents in
different T-even phages. The complex modification and
restriction of T4 DNA, further discussed later in this
chapter, can best be rationalized as the result of an
ongoing evolutionary process that includes exchanges
between the phage, its host, and prophages and plasmids
resident in the host. Most T-even phages destroy dCTP,
synthesize dHMCTP, and use the latter for their own
DNA synthesis.

Using genetic tricks, T4 mutants containing unmodified
cytosines in their DNA have been isolated (68, 228). Their
DNA has been instrumental in cloning and sequencing
the T4 genome, but the lack of modification affects
restriction and regulation of transcription termination (98), and
limits viability and host range (338). Moreover, as
discussed below, origin initiation of DNA replication (439)
and packaging (241) appear to be different from those of
wild-type T4.

Figure 18-4 Diagram of the relationship between the T4 transcriptional pattern and the different mechanisms of DNA 
replication and recombination. The upper panel shows the transcripts initiated from early, middle, and late promoters by 
sequentially modified host RNA polymerase. Hairpins in several early and middle transcripts inhibit translation of the late 
genes present on these messenger RNAs. The lower panel depicts the pathways of DNA replication and recombination 
detailed in this chapter. Hatched lines represent strands of homologous regions of DNA, and the arrows point to possible 
positions of endonuclease cuts. Replication can be initiated only from cuts marked by filled arrows. Cuts indicated by open 
arrows cannot be used to initiate replication forks. Modified from (279). 


Mature T4 DNA molecules (chromosomes) packaged 
into virions are linear. In contrast, intracellularly replicating 
DNA contains multiple covalently linked copies of the 
genome, called “concatemers” (117, 399), which are highly 
branched as a result of recombination and replication. 
These concatemers are cut during packaging (see below) 
at nearly random positions of the circular map (figure 18-2). 
Because the heads are filled with DNA corresponding to 
approximately 103% of the T4 genome, processive packaging 
generates the circular permutations of chromosomal ends 
(388) and ensures that each chromosome is diploid for more 
than 5000 bp at the ends (so-called terminal redundancies). 
The terminal redundancy allows homologous recombination 
between the two terminal regions of the same chromosome 
and recombination-dependent DNA replication (85, 280) 
after infection of bacteria with a single phage particle. 
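
The arithmetic behind circular permutation and terminal redundancy can be sketched in a few lines. This is an illustrative model only: the function name, the rounded genome length of 169,000 bp, and the assumption that successive heads are filled from a single idealized, unbranched concatemer without gaps are simplifications introduced here, not data from the studies cited above.

```python
def headful_packaging(genome_bp, headful_percent, n_heads, start=0):
    """Idealized processive headful packaging from one linear concatemer.

    Returns the terminal redundancy in bp and, for each packaged
    chromosome, the position on the circular map (0-based) of its first base.
    """
    headful_bp = genome_bp * headful_percent // 100  # DNA packaged per head
    redundancy_bp = headful_bp - genome_bp           # sequence present at both ends
    starts = []
    pos = start
    for _ in range(n_heads):
        starts.append(pos % genome_bp)  # map position of this chromosome's left end
        pos += headful_bp               # the next head starts where this one ended
    return redundancy_bp, starts


# Assumed, rounded genome length; 103% headful as stated in the text.
redundancy, starts = headful_packaging(169_000, 103, 4)
print(redundancy)  # 5070
print(starts)      # [0, 5070, 10140, 15210]
```

In this toy model each successive chromosome begins about 3% of the map further along than the previous one, so a series of heads filled from one concatemer yields circularly permuted ends, and the redundancy of roughly 5 kb is consistent with the "more than 5000 bp" quoted above. The first cut is set by the hypothetical `start` parameter; real concatemers are branched and are cut at nearly random positions.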

Numerous mutations, and their assignments to complementation 
groups and open reading frames (ORFs), have 
defined approximately 160 proteins with known functions 
(table 18-1). More than 120 additional ORFs of unknown 
function can be deduced from the DNA sequence (262). 
Most, if not all, of these ORFs are transcribed, and some of 
their protein products have been detected by polyacrylamide 
gel electrophoresis. There are many overlapping 
coding regions: several genes direct synthesis of more than 
one protein from in-frame or (in two cases) out-of-frame 
internal start codons; many ORFs are very small; and in at 
least one region (replication origin E) both complementary 
DNA strands encode proteins. In spite of the large genome 
size, only a few short regions are devoid of coding capacity. 
These and other observations suggest that apparently 
redundant and “nonessential” genes confer selective 
advantages. 

In contrast to many other phages (e.g., the lambdoid and 
T7-related phages; chapters 27 and 20), T-even early and late 
transcription units are not clearly separated but are interdigitated 
(e.g., late genes 26, 25 and the middle gene uvsY). 
Most genes are transcribed from multiple promoters, and 
genes for related or interacting proteins are not necessarily 
clustered (a most striking example is genes 5 and 27, which 
code for the first interacting baseplate components; 187). 
Moreover, the direction of transcription does not unambiguously 
distinguish between early and late genes. In fact, 
in many cases early and late genes are cotranscribed (291) 
(figure 18-4). In this respect, as well as in the amino acid 
sequences of some of their proteins (e.g., DNA polymerase 
and terminase), the T-even phages more closely resemble 
the herpesviruses than certain other phages. 

Gene | Function of gene product | Size (kDa) | Mutant phenotype | Restrictive host or condition
rIIA | Membrane-associated protein; affects host membrane ATPase | 82.9 | Rapid lysis; suppress T4 30 and some 32 mutations | rex+ λ lysogens; P2-like HK239 lysogen; tabR
60 | DNA topoisomerase subunit | 18.6 | DNA delay; rc = acriflavine resistance | 25 °C or below
mobA | Pseudogene of Mob site-specific DNA endonuclease | 4.2 |  | Nonessential
39 | DNA topoisomerase subunit; DNA-dependent ATPase; membrane-associated protein | 58.0 | DNA delay; rc = acriflavine resistance | 25 °C or below; synthetic lethal with T4 49 and 17 mutations, or when host topoisomerase IV is poisoned with novobiocin
plaCTr5x |  |  |  | CTr5x
goF = comC-α = go9H | Affects mRNA metabolism | 16.7 | Allows T4 growth in NusD rho hosts | Auxiliary
cef = mb = M1 = motC | Processing of T4 tRNAs | 8.5 |  | Auxiliary; CT439; roc− hosts
pseF = plaCTr5x? | 5' phosphatase |  |  | Auxiliary
motB |  | 18.2 | Affects middle transcription | Auxiliary
dexA | Exonuclease A | 26.0 |  | Auxiliary; restricted on optA− hosts
dda = sud | DNA helicase; DNA-dependent ATPase | 49.9 | Suppress certain T4 32 mutations | Auxiliary; synthetic lethal with T4 59 mutations
srd = dda.2 | Postulated decoy of host σ70 or σS | 29.0 |  | Auxiliary
modA | Adenylribosylating enzyme | 23.4 | α-subunits of host RNA polymerase are incompletely modified | Auxiliary
modB | Adenylribosylating enzyme | 24.2 |  | Auxiliary
srh = modA.5 | Postulated decoy of host σ32 | 8.1 | Delays early T4 gene expression at high temperatures | Auxiliary
mrh | Affects phosphorylation of host σ32 | 18.5 | Allows T4 growth in a σ32 host | Auxiliary
soc | Small outer capsid protein | 9.1 | Unstable T4 capsids | Auxiliary
segF = 69 | Intron-like endonuclease. A probable fusion protein, generated from 56 and 69 by hopping of ribosomes across a pseudoknot, is larger | 26.2 |  | Nonessential
56 | dCTPase; dUTPase; dCDPase; dUDPase | 20.4 | Little DNA synthesis; unstable DNA | Essential
oriA | Several sequences in 56, 69, and soc required in cis; primer transcript same as transcripts for these genes |  | No replication from origin A | Auxiliary
dam | DNA adenine methylase | 30.4 | No DNA adenine methylation | Auxiliary
61 = 58 | Primase; requires interaction with gp41 helicase for priming at unique sequence | 39.8 | DNA delay | Auxiliary; 25 °C or below; synthetically lethal with T4 49 or 17 mutations
sp = 61.3 = rIV | Periplasmic protein | 11.0 | Rapid lysis; suppresses e lysozyme mutations | Auxiliary
dmd = 61.5 | Discriminator of mRNA degradation | 7.0 | Excessive mRNA degradation | Nonessential; suppressed by motA mutations
41 | Replicative and recombination DNA helicase; GTPase; ATPase; dGTPase; dATPase | 53.6 | DNA arrest; little DNA displacement synthesis | Essential
40 | Membrane-associated protein initiator of head vertex | 13.3 | Polyheads | Auxiliary; high temperatures
uvsX = fdsA | RecA-like recombination protein; DNA-ATPase | 44.0 | UV- and X-ray sensitive; recombination deficient; suppress 49 mutations | Auxiliary
segA | Site-specific intron-like DNA endonuclease | 25.3 |  | Nonessential
β-gt | β-glucosyltransferase | 40.7 | No β-glucosylation of HMC DNA | Auxiliary; Shigella
42 | dCMP hydroxymethylase | 28.5 | Little or no DNA synthesis | Essential
imm | Inner membrane protein | 9.3 | No immunity to superinfection, membrane protein | Auxiliary
43 | DNA polymerase; 3' to 5' exonuclease | 103.6 | No DNA synthesis; mutator or antimutator activities of conditional lethals under semipermissive conditions | Essential; nonessential dsd mutants do not grow in optA hosts
regA | Translational repressor of several early genes | 14.6 | Extended synthesis of several early proteins | Auxiliary; restricted in rpoB5081 at 42 °C
62 | Clamp-loader subunit | 21.4 | No DNA synthesis | Essential
44 | Clamp-loader subunit | 35.8 | No DNA synthesis | Essential
45 | Processivity enhancing sliding clamp of DNA polymerase; and mobile enhancer of late promoters | 24.9 | No DNA synthesis; no late transcription | Essential
rpbA | RNA polymerase binding protein | 14.7 |  | Auxiliary
46 | Recombination protein and nuclease subunit | 63.6 | Recombination deficient; DNA arrest; no host DNA degradation | Essential in B strains; mutants are “leaky” in some K strains
47 | Recombination protein and nuclease subunit | 39.2 | Recombination deficient; DNA arrest; no host DNA degradation | Essential in B strains; mutants are “leaky” in some K strains
α-gt | α-glucosyl-transferase | 46.7 | No α-glucosylation of HMC | Auxiliary
mobB | Putative site-specific intron-like DNA endonuclease | 30.4 |  | Nonessential
55 | σ factor recognizing late T4 promoters | 21.5 | No late transcription | Essential
nrdH = 55.7 | Anaerobic nucleotide reductase subunit | 11.7 |  | Auxiliary
nrdG = 55.9 | Anaerobic nucleotide reductase subunit | 18.2 |  | Auxiliary
mobC = 55.10 | Putative site-specific intron-like DNA endonuclease | 24.0 |  | Auxiliary
nrdD = sunY | Anaerobic ribonucleotide reductase subunit; RNA contains a self-splicing intron | 68.064 |  | Anaerobic growth
I-TevII | Endonuclease for nrdD-intron homing | 30.4 |  | Auxiliary
49 | Recombination endonuclease VII | 18.1 | No resolution of recombination junctions; incomplete packaging of DNA; reduced heteroduplex repair, reduced DNA synthesis | Essential
49' | Internal translation initiation product | 11.9 |  | 
pin | Inhibitor of host Lon protease | 18.8 | No degradation of amber peptides | Auxiliary
nrdC | Thioredoxin, glutaredoxin | 10.1 |  | Auxiliary
mobD | Putative site-specific DNA endonuclease | 30.5 |  | Nonessential
rI = tk.-2 | Membrane protein | 11.1 | No lysis inhibition | Auxiliary
tk | Thymidine kinase | 21.6 |  | Auxiliary
vs | Modifier of valyl-tRNA synthetase | 13.1 |  | Auxiliary
regB | Site-specific RNase | 18.0 | Misregulation of early genes | Auxiliary
denV | Endonuclease V; N-glycosidase | 16.1 | UV-sensitive | Auxiliary
ipII | Internal protein II | 11.1--9.9 |  | Auxiliary
ipIII | Internal protein III | 21.7--20.4 |  | Auxiliary
e | Soluble lysozyme; endolysin | 18.7 | No cell lysis | Essential, except when suppressed by sp and 5 mutations
nudE = e.1 | Nudix hydrolase | 17.0 |  | Auxiliary
goF3 |  |  | Allow T4 growth in nusD rho hosts | Auxiliary
rnaC = Species 1 | Stable RNA |  |  | Auxiliary
rnaD = Species 2 | Stable RNA |  |  | Auxiliary
tRNA arg |  |  | psu 4 opal suppressor | CT439
segB | Probable site-specific intron-like DNA endonuclease | 26.2 |  | Nonessential
tRNA ile |  |  |  | CT439
tRNA thr |  |  |  | CT439
tRNA ser |  |  | psu a; psu b; psu^ amber suppressors | CT439
tRNA pro |  |  |  | CT439
tRNA gly |  |  |  | CT439
tRNA leu |  |  | psu 3 | CT439
tRNA gln |  |  | psu 2; SB | CT439
ipI | Internal protein I | 10.2--8.5 |  | CT596
57B |  | 17.2 |  | ?
57A | Chaperone of long and short tail fiber assembly | 8.7 | Defective tail fiber assembly | Essential; by-passed by certain host mutations
1 | dNMP kinase | 27.3 | No DNA synthesis | Essential
3 | Head-proximal tip of tail tube | 19.7 | Unstable tails | Essential
2 = 64 | Protein protecting DNA ends | 31.6 | Noninfectious particles with filled heads | Essential, except in recBCD hosts
4 = 50 = 65 | Head completion protein | 17.6 | Noninfectious particles with filled heads but tails attached at wrong angles | Essential
53 | Baseplate wedge component | 23.0 | Defective tails | Essential
5 | Baseplate lysozyme; hub component | 63.1--44 & 19 | Defective tails | Essential
oriE | cis-acting sequences in genes 4, 53, 5; primer transcript in opposite orientation of gene 5 transcripts |  | No DNA replication from oriE | Auxiliary
repEB | Protein required for initiation from oriE | 5.48 | No DNA replication from oriE | Auxiliary; synthetic lethal with motA mutation
repEA | Protein auxiliary for initiation from oriE | 6.13 | Anomalous DNA replication from oriE | Auxiliary
segC | Site-specific intron-like DNA endonuclease | 22.2 |  | Nonessential
6 | Baseplate wedge component | 74.4 | Defective tails; permit plating of fiberless phage | Essential
7 | Baseplate wedge component | 119.2 | Defective tails; permit plating of fiberless phage | Essential
8 | Baseplate wedge component | 38.0 | Defective tails | Essential
9 | Baseplate wedge component, tail fiber socket, trigger for tail sheath contraction | 31.0 | No attachment of tail fibers | Essential
10 | Baseplate wedge component, tail pin | 66.2 | Defective tails | Essential
11 | Baseplate wedge component, tail pin, interface with short tail fibers, gp12 | 23.7 | Defective tails | Essential
12 | Short tail fibers | 56.2 | Defective tails | Essential
wac | Whiskers, facilitate long tail fiber attachment | 51.9 | No whiskers | Auxiliary
13 | Head completion | 34.7 | Inactive, but filled heads | Essential
14 | Head completion | 29.6 | Inactive, but filled heads | Essential
15 | Proximal tail sheath stabilizer, connector to gp3 and/or gp19 | 31.6 | Defective tails | Essential
16 | Terminase subunit, binds double-stranded DNA | 18.4 | Empty heads | Nearly essential
16' | Truncated C-terminal end |  |  | 
17 | Terminase subunit with nuclease and ATPase activity; binds ssDNA, gp16 and gp20 | 69.8 | Empty heads | Essential
17'A, 17'B | Terminase subunits with nuclease and ATPase activity; internal transcription and translation in frame; does not bind ssDNA | 59.2 (17'A), 57.1 (17'B) |  | 
17'' | Terminase subunit with nuclease and ATPase activity (transcript processing and internal initiation of translation in frame); does not bind ssDNA. Several additional proteins most likely initiated from internal ribosome binding sites of the 17 transcripts | 46.8 |  | 
18 | Tail sheath monomer | 71.3 | Defective tails | Essential
19 | Tail tube monomer | 18.5 | Defective tails | Essential
20 | Portal vertex protein of the head | 61.0 | Polyheads | Essential
pip = 67 | Prohead core protein; precursor to internal peptides | 9.1--small peptides | Defective heads | Essential
68 | Prohead core protein | 15.9 | Isometric heads | Essential
21 | Prohead core protein and protease | 23.3--small peptides | No or defective heads | Essential
21' | Prohead core protein and protease (internal initiation of translation) | 20.8--small peptides | Defective heads | 
22 | Prohead core protein; precursor to internal peptides | 29.9--small peptides | No or faulty heads | Essential
23 | Precursor of major head subunit | 56.0--48.7--43 | No or faulty heads; gol mutations in gene 23 allow growth in lit hosts (CTR5x) | Essential; Gol peptide together with E. coli Lit, cleaves host EF Tu
segD | Probable site-specific intron-like DNA endonuclease | 25.6 |  | Nonessential
24 | Precursor of head vertex subunit | 47.0--46, 48.4? | No or faulty heads, osmotic shock resistance | Essential; by-passed by certain gene 23 mutations
rnlB = 24.1 | Second RNA ligase | 37.6 |  | ?
hoc = eph | Large outer capsid protein | 40.4 | Unstable capsids | Auxiliary
inh = lip | Minor capsid protein; inhibitor of gp21 protease | 25.6 |  | Auxiliary
segE | Probable site-specific intron-like DNA endonuclease | 22.9 |  | Nonessential
uvsW = dar | RNA-DNA- and DNA-helicase; DNA-dependent ATPase | 67.5 | UV-sensitive; fail to unwind R-loops; suppress T4 59, uvsX, uvsY, and 46 mutations | Auxiliary
uvsY = fdsB | ssDNA binding, recombination and repair protein; helper of UvsX, inhibitor of endo VII | 15.8 | UV-sensitive; recombination-deficient; repair-deficient, DNA arrest; suppress T4 49 mutations | Auxiliary
oriF = oriuvsY | cis-acting sequences in genes uvsY, uvsY.-1 and uvsY.-2; primer transcript same as uvsY, uvsY.-1 and uvsY.-2 transcript |  | No DNA replication from oriF | Auxiliary
25 | Baseplate wedge subunit | 15.1 | Defective tails | Essential
26 | Baseplate hub subunit | 23.9 | Defective tails | Essential
26' | Internal in-frame translation initiation | 12 |  | ?
26'' | Internal out-of-frame translation initiation | 10 |  | ?
51 | Baseplate hub assembly catalyst? | 29.3 | Defective tails | Essential
27 | Baseplate hub subunit | 44.5 | Defective tails; permit plating of fiberless phage | Essential
28 | Baseplate distal hub subunit | 17.3 | Defective tails | Essential
29 | Baseplate hub; determinant of tail length | 64.4 | Defective tails | Essential
48 | Baseplate; tail tube associated | 39.7 | Defective tails | Essential
54 | Baseplate-tail tube initiator | 35.0 | Defective tails | Essential
alt | Adenosylribosyltransferase (packaged and injected with DNA) | 75.8 | Synthetic defective with modA and modB deletions | Auxiliary
30 = lig | DNA ligase | 55.3 | DNA arrest; hyper-recombination | Essential; can be by-passed by functioning host ligase, when T4 rII is defective
rIII | Unknown | 9.3 | Rapid lysis | Auxiliary
31 | Co-chaperonin for GroEL | 12.1 | Head assembly; gp23 forms lumps; T4 topoisomerase is defective | Essential
cd | dCMP deaminase | 21.2 |  | Auxiliary
pseT | Deoxyribonucleotide 3' phosphatase, 5' polynucleotide kinase | 34.6 |  | Auxiliary; CTr5x (lit)
alc = unf | RNA polymerase- and DNA-binding protein; transcription terminator on dC-DNA | 19.0 | Allow transcript elongation on C-DNA; no unfolding of host nucleoid | E. coli (pR386)
rnlA = 63 | RNA ligase; catalyst of tail fiber attachment | 43.5 | Defective tail fiber attachment | Auxiliary
denA | Endonuclease II that restricts dC-containing DNA | 16.7 | Defective in host DNA degradation | Auxiliary; restricted in E. coli B rpoB5081
nrdB | Ribonucleotide reductase β-subunit (contains intron) | 45.3 | Reduced DNA synthesis | Auxiliary; nrd-defective hosts
I-TevIII | Defective intron homing endonuclease | 11.3 |  | Nonessential
mobE | Putative mobile endonuclease | 16.5 |  | Nonessential
nrdA | Ribonucleotide reductase α-subunit | 86 | Reduced DNA synthesis | Auxiliary; nrd-defective hosts
td | Thymidylate synthetase (contains intron) | 33.1 | Reduced DNA synthesis | Auxiliary; td-defective hosts
I-TevI | Intron homing endonuclease | 28.2 |  | Auxiliary
frd | Dihydrofolate reductase | 21.7 | Reduced DNA synthesis | Auxiliary
32 | ssDNA binding protein, scaffold of DNA replication, recombination and DNA-precursor-synthesizing protein machines | 33.5 | DNA arrest, UV-sensitive, recombination and excision repair deficient | Essential; Tab32 for ts mutants; 32 am mutations in ochre-suppressor-containing hosts are suppressed by dda mutations
segG = 32.1 | Site-specific DNA endonuclease, leading to localized gene conversion, exclusion | 24.6 |  | Auxiliary
59 | Loader of gene 41 DNA helicase, ssDNA binding protein | 26.0 | Fail to load gp41 helicase onto recombination intermediates, or ssDNA covered with gp32 or UvsX protein; DNA arrest | Almost essential
33 | Protein connecting gp45 and gp55, to allow transcription by RNA polymerase from late promoters | 12.8 | No late RNA synthesis | Essential
dsbA | Double-stranded DNA binding protein | 10.4 | Facilitates some late RNA synthesis | Auxiliary
rnh = das | RNaseH; 5' to 3' DNase; yeast FEN homolog | 35.6 | Defective processing of Okazaki fragments; das mutations suppress T4 46, 47 and uvsX mutations | Auxiliary
34 | Proximal tail fiber subunit | 140.4 | Fiberless particles | Essential
oriG = ori34 | Primer transcript in opposite orientation of 34 transcript |  |  | Auxiliary
35 | Tail fiber hinge | 40.1 | Fiberless particles | Essential
36 | Small distal tail fiber subunit | 23.3 | Fiberless particles | Essential
37 | Large distal tail fiber subunit | 109.2 | Fiberless particles, host range | Essential
38 | Assembly catalyst of distal tail fiber | 22.3 | Fiberless particles | Essential
t = rV = stII | Holin, inner membrane pore protein, affects lysis timing and inhibition | 25.2 | Affect lysis by e lysozyme; suppress T4 rII and 63 mutations | Essential
asiA | Protein that binds to host σ70, inhibits interaction with -35 regions of classical promoters, and facilitates interaction with T4 MotA protein | 10.6 | Defective middle mode, and (indirectly) late transcription | Almost essential
arn | Inhibitor of McrBC restriction nuclease | 10.9 |  | Auxiliary
motA = sip | Activator of middle promoters; dsDNA binding protein specific for mot boxes | 23.6 | Defective middle mode transcription; suppress rII-defects in λ lysogens; affects interaction with σ70 and AsiA | Almost essential
52 | DNA topoisomerase subunit; membrane-associated protein | 50.6 | DNA delay | Temperatures below 25 °C; inhibition of host topoisomerase IV with novobiocin
ac | Membrane protein | 5.5 | Acriflavine resistant | Auxiliary
ama = rs |  | 5.4 | Acriflavine resistant | Auxiliary
stp | Peptide modulating host restriction system | 3.7 | Suppress pseT mutations | Auxiliary
ndd = D2b | Protein that disrupts host nucleoid; binds to host HU | 16.9 | Nucleoid disruption defective | Auxiliary; CT447
pla262 | Unknown |  |  | CT262
denB | Endonuclease IV, single-strand specific endonuclease | 21.2 | Allows progeny production of T4 with dC-DNA | Auxiliary
rIIB | Membrane-associated protein; affects host membrane ATPase | 35.5 | Rapid lysis; suppresses T4 30 and some 32 mutations | rex+ λ lysogens; P2-like HK239 lysogen; tabR


Genes are listed by the currently used names, followed by alternative designations in the literature. Gene products processed into smaller peptides are indicated 
(--) with the sizes or size range following the principal product. Because the distinction between “essential” and “nonessential” is not obvious, when the 
mutants were not tested under all possible conditions or in all possible hosts, “nonessential” genes are noted as auxiliary. Where known, restrictive hosts or 
plating conditions for several mutant genes are noted. 


What is the meaning of these complexities and redundancies? 
It is useful to think about them in terms of phage 
evolution (153) (chapter 4). The T4 genome is assembled 
by mixing and matching different components, entire 
genes and gene segments, from the genomes of other 
phages, plasmids, or hosts, and evolutionary pressure is still 
at work to coordinate these components into a functional 
unit. 


Virion Structure and Initiation of Infection 

Phage T4 devotes a large percentage of its genes to the 
synthesis and assembly of structural components, from 
which the complex virion is assembled. Figure 18-3 shows a 
diagram of the T4 particle derived from negative stain, 
cryo-electron microscopy and X-ray diffraction, and lists the 
structural components of the phage. Twenty-four genes 
control head morphogenesis. The head itself contains at 
least 10 different polypeptides (figure 18-3), while other T4 
genes control prohead initiation (gp20, gp40), protein folding 
(gp31), scaffold assembly (gp22, gp67, gp68, lip = inh), 
proteolytic processing (gp21), DNA packaging and 
protection (gp16, gp17, gp2), and head completion (gp13, 
gp14, gp4). Gp wac is fibritin, from which the whiskers 
are assembled. 


The tail and fibers contain 26 structural proteins, and five 
other T4 gene products are needed as assembly catalysts or 
chaperones: gp51, gp63, gp57A, gp31, and gp38. 

The long tail fibers, attached to the baseplate, and fibritin 
(wac) “whiskers” near the head serve as sensors to detect 
the presence of a bacterial cell. The baseplate-tail complex 
is a molecular machine designed to deliver DNA into the 
cell. Like a cocked mechanical toy, it stores energy in the 
form of constrained protein conformations. 

T4 adsorption and infection occur in several steps. 
Before the T4 infectious cycle begins, the long tail fibers are 
released from their stored position at the whiskers. This 
permits the thin tips of the long fibers to interact reversibly 
with the cell surface, at diglucosyl residues in the outer 
membrane in E. coli B lipopolysaccharide (LPS) (135) or 
at the OmpC protein in E. coli K-12. Phage T4 uses gp37 to 
attach to the host’s cell surface, while phage T2 and several 
other T-even phages use gp38. In phage T2, fiber attachment 
is to the bacterial surface protein OmpF, while phage K3 
uses the OmpA protein (156, 436). 

The long tail fibers, while binding reversibly to the 
outer membrane receptors, move over the cell surface 
and transmit conformational information to the baseplate 
via gp9, which normally keeps the baseplate in a constrained 
position (84). Binding by at least three long tail 
fibers converts the baseplate from a hexagon into an 
expanded, star-like structure that releases the gp12 short 
tail fibers from their stored and folded position. This leads to 
irreversible binding of the virus to the host cell by contact of 
the gp12 short tail fibers with a surface receptor, using 
heptose residues in the E. coli LPS core. The short tail fibers 
of all T-even phages bind to surface LPS but not to the 
outer membrane proteins. Sheath contraction is also regulated 
in part by the long tail fibers, since fiberless phages 
are more resistant to chemically induced sheath contraction 
than are phages with the normal complement of long 
tail fibers. 
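
The sequence of adsorption steps described above can be summarized as a small decision rule. The sketch below is a deliberately coarse illustration, not a quantitative model: the names `Stage` and `adsorption_stage` and the `lps_core_accessible` flag are hypothetical, and the only numbers used are the six long tail fibers of the T4 virion and the threshold of at least three bound fibers mentioned in the text.

```python
from enum import Enum

TOTAL_LONG_FIBERS = 6      # a T4 virion carries six long tail fibers
MIN_BOUND_FOR_TRIGGER = 3  # "at least three", as stated in the text


class Stage(Enum):
    UNBOUND = "no long tail fiber bound"
    REVERSIBLE = "reversibly bound via long tail fibers"
    IRREVERSIBLE = "baseplate expanded; gp12 short fibers bound to LPS core"


def adsorption_stage(bound_long_fibers, lps_core_accessible=True):
    """Return the adsorption stage for a given number of bound long fibers."""
    if not 0 <= bound_long_fibers <= TOTAL_LONG_FIBERS:
        raise ValueError("bound_long_fibers must be between 0 and 6")
    if bound_long_fibers == 0:
        return Stage.UNBOUND
    # The baseplate switches from hexagon to star only above the threshold,
    # and the short fibers then need their LPS-core receptor (assumed flag).
    if bound_long_fibers >= MIN_BOUND_FOR_TRIGGER and lps_core_accessible:
        return Stage.IRREVERSIBLE
    return Stage.REVERSIBLE


for n in (0, 2, 3):
    print(n, adsorption_stage(n).name)
```

Treating the transition as a sharp threshold is an assumption made for clarity; the text only states that binding by at least three fibers is needed for the baseplate rearrangement.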

Several models have been described for the nature of 
the protein interactions during sheath contraction (107). 
The “induced conformational change” model of (195) 
proposes that sheath extension is maintained by sheath-tube 
subunit interactions, and that there is no tension 
along the sheath. Contraction is triggered by a wave of 
conformational change along the tube that releases the 
gp18-gp19 interactions, and sheath contraction follows 
this wave. This model incorporates Moody and Markowski’s 
(268) observations of partially contracted sheaths and the 
artificial crosslinking of sheath to tube by formaldehyde 
treatment. A combination of models, with conformational 
energy supplied by gp18-gp19 interactions and initiation 
by release of gp18-gp19 interactions, has been presented by 
Caspar (70). This model shows in detail the steps in the 
contraction of a mechanical device that illustrates clearly 
the structural relationships during the contraction process. 

A needlelike structure centered under the baseplate, 
composed of gp27 and gp5, has been proposed to initiate 
penetration of the cell envelope (187). Gp5, which is cleaved 
during this process, contains a lysozyme domain that may 
locally digest the cellular peptidoglycan, preparing the 
way for DNA injection (292). The needle must then move 
aside or detach to allow penetration of the cell wall by the 
tail tube. The force required for penetration is associated 
with contraction of the tail sheath. 

The last step in infection, DNA transfer, remains poorly 
characterized, but most likely involves a transmembrane 
channel, evidenced by ion leakage from host cells upon 
infection (135). A putative channel may pre-exist in the 
membrane, or form from host proteins, phage proteins, or 
a combination of both. Its function requires membrane 
potential (135). 

Alternatively, it has been proposed that the tail tube 
itself acts as a DNA conduit directly into the host cell. 
Studies by Furukawa et al. (124, 125) and Tarahovsky et al. 
(395) suggest that the tail tube penetrates the outer membrane 
by mechanical force generated by sheath contraction, 
then induces localized inner membrane fusion around the 
tip of the tail tube. 

Several virion proteins are ejected with the DNA. 
The gp29 “tape measure” protein, located in the tail tube 
channel, must exit before DNA can enter. Gp2 protects the 
ends of the phage DNA from exonuclease V (370). The 
ADP-ribosylating protein, gp alt, may influence the earliest 
stage of T4 transcription (206). The cleaved internal 
proteins are also injected, but their intracellular roles are 
unknown (50). 

Complete transfer of phage DNA and proteins into a host 
cell requires energy supplied by the membrane-potential 
component of the proton-motive force (135). However, 
membrane potential is not the main driving force of DNA 
transfer, since artificially contracted phage can transfer 
their DNA into de-energized spheroplasts (124). It is 
proposed that the role of membrane potential is to provide 
energy to bring the outer and inner membranes closer 
together to facilitate interaction of the phage tail with the 
inner membrane, and that energy regulates, but does not 
drive DNA transfer. 

Temporally Controlled Transcription 

The Hershey-Chase experiment established unambiguously 
that T-even DNA is sufficient to establish viral growth (163). 
Inactivation of host functions and temporal regulation of 
expression of different classes of phage genes are then 
exerted at several levels: transcript initiation, elongation, 
termination, and stability; translational controls, and 
combinations thereof (227, 261, 291). 

In contrast to the T7- or N4-related phages, the T-evens 
use the host’s core RNA polymerase throughout their life 
cycle. Differential transcription initiation from T4 early, 
middle or late promoters is due, in part, to a cascade of RNA 
polymerase modifications: several T4 proteins (table 18-1) 
associate noncovalently with the core RNA polymerase, 
and the α-subunits are covalently ADP-ribosylated. Several 
of these proteins appear “nonessential” under standard 
laboratory conditions, suggesting that the transitions from 
host gene expression to different stages of T4 development 
occur in many small and redundant steps (288, 209). 

One of the earliest synthesized T4 proteins, Alc, binds 
to RNA polymerase and DNA, and selectively terminates 
transcription from C-containing (host), but not HMC-containing 
(T4) DNA (98, 194, 376). This causes unfolding 
of host DNA, explaining the alternative name unf for the 
corresponding gene (161). The inactivation of host transcription 
is subsequently reinforced by other modifications 
of the host’s RNA polymerase. 

The developmental regulation of T4 gene expression 
more closely resembles a web of interacting regulatory 
networks than a cascade of linear timed pathways. This 
has led to often confusing nomenclature. Different classes 
of phage genes can be distinguished as early (also called 
immediate early, IE), middle (also called delayed early, DE), 
or late. Genes that are transcribed both early and late have 
also been called quasi-late. 

Another classification criterion distinguishes all genes 
that are expressed prior to the onset of DNA replication 
(“prereplicative” or “early”) from “postreplicative” or “late” 
genes, whose expression depends on DNA replication, 
specifically the sliding clamp gp45, on a T4-encoded sigma 
factor (gp55) and on a mediator protein, gp33. 
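
The requirements for late ("postreplicative") transcription listed above amount to a simple conjunction, which can be written out as a check. This is only a logical restatement of the paragraph under the assumption that each requirement is all-or-none; the function and dictionary names are hypothetical.

```python
# Phage proteins named in the text as required for late transcription.
LATE_REQUIREMENTS = {
    "gp45": "sliding clamp",
    "gp55": "T4-encoded sigma factor",
    "gp33": "mediator protein",
}


def late_transcription_possible(dna_replicating, proteins_present):
    """Return (possible, missing) for late-gene transcription.

    `proteins_present` is any iterable of gene-product names, e.g. {"gp45"}.
    """
    missing = sorted(set(LATE_REQUIREMENTS) - set(proteins_present))
    possible = bool(dna_replicating) and not missing
    return possible, missing


print(late_transcription_possible(True, {"gp45", "gp55", "gp33"}))  # (True, [])
print(late_transcription_possible(True, {"gp45", "gp33"}))          # (False, ['gp55'])
```

The second call mirrors the phenotype listed for gene 55 mutants in table 18-1 (no late transcription).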

Early Transcription 

Operationally, early (IE) genes are defined as being 
transcribed by host RNA polymerase when phage gene 
expression is prevented by inhibitors of protein synthesis 
(e.g., chloramphenicol). Under these conditions the host 
RNA polymerase uses unmodified σ70 to recognize early 
T4 promoters, but the core RNA polymerase may be altered 
by T4 Alt protein that is injected with the phage DNA. Alt 
protein ADP-ribosylates one α-subunit at Arg265. This 
amino acid is important for recognition of so-called UP 
elements in promoters for ribosomal RNA (343), for several 
activator proteins of the host and of other phages (174), 
and for the dimerization of the C-terminal domains of RNA 
polymerase α-subunits (62, 126). Such ADP-ribosylation is 
beneficial but not essential for T4 growth under optimal 
laboratory conditions. 

Most T4 early promoters resemble the consensus 
sequence of the major E. coli promoters, but they have 
extended -10 regions, different -35 regions, and additional 
information content (430, 431) (figure 18-4). Many of them 
have upstream poly(A) or poly(T) tracts, which may 
enhance transcription as bendable sequences or as UP 
elements or both. At least three exceptional early T4 
promoters (P57, Pbac, and PrepE) more closely resemble the 
major E. coli promoters (162, 239, 407). 

Several factors, none of them essential for T4 growth, 
have been proposed to contribute to the preferential 
transcription from T4’s early promoters when E. coli promoters 
are not yet degraded (227). 

1. Most importantly, the early T4 Alc protein, once it is 
made, inhibits all transcript elongation on (cytosine- 
containing) host DNA. 

2. ADP-ribosylation of Arg265 of one α-subunit of the 
host's RNA polymerase by the T4 Alt protein and by 
the early T4 ModA protein differentially affects 
transcription from various T4 or E. coli promoters (430, 
431). 

3. At least two small T4 proteins (Srh and Srd) share 
interaction sites of two E. coli sigma factors with the 
core RNA polymerase. They might compete with 
them like decoys (288). 

4. At the time of infection, the host DNA is associated 
with nonspecific (e.g., HU, H-NS) or semispecific (e.g., 
IHF, FIS) DNA binding proteins. In contrast, the 
infecting phage DNA is at first largely free of proteins and 
may be more readily accessible to the host’s RNA 
polymerase (227). 

5. The early T4 AsiA protein binds to the C-terminal 
segment of σ70 (166, 230, 314, 356), interfering with 
transcription from host and phage promoters containing 
consensus -35 regions (5, 6, 77, 209, 313). 

Later on, the degradation of host DNA by several early 
T4 proteins (66, 229, 375) eliminates all host transcription. 

Middle Transcription 

In contrast to early (IE) genes, expression of middle (and 
late) T4 genes requires phage-directed protein synthesis for 
several reasons. Transcription from middle promoters 
requires prior synthesis of two early T4 proteins: AsiA 
and MotA. Alternatively, downstream genes of 
transcription units initiated from early promoters may require 
antitermination or stabilization of rare long transcripts 
(383-385). The distinction between different classes is 
blurred, however, because many T4 genes are under dual 
or multiple transcriptional controls, and transcription of 
middle (DE) genes may be delayed for several reasons. 
Moreover, many late T4 genes are transcribed from late 
promoters but are also cotranscribed with early genes 
from early or middle promoters. For these late genes, post- 
transcriptional mechanisms (discussed below) prevent 
expression at early times. 

The AsiA protein (77, 166-168, 382) inhibits host 
transcription, because it binds to region 4 of σ70 (265, 316, 
355, 356) and prevents transcription of host promoters with 
“standard” -35 regions. It has been postulated that AsiA 
also turns off early T4 promoters. However, experiments 
using asiA deletions indicate that other, yet unknown 
factors are important for the shutoff of most early T4 promoters 
(325), which have nonstandard -35 regions. 

AsiA is also an activator of T4 middle promoters (166). 
At middle promoters the σ70 subunit, complexed with AsiA 
protein and core RNA polymerase, recognizes consensus 
-10 regions, and the AsiA protein interacts with T4 
MotA protein (255) bound to consensus DNA sequences 
centered at -30 positions, called MotA boxes (166, 251, 
383) (figure 18-4). AsiA is a dimer in solution, but a 
monomer when associated with σ70. The dimer interface of one 
monomer is alternatively used to contact σ70 (230). Two 
sequenced T4-related phages, RB49 (87) and KVP40 (262), 
apparently lack the middle mode transcription. 

Collectively, prereplicative genes encode: (i) nucleases 
that degrade the host DNA; (ii) enzymes of the 
deoxyribonucleotide biosynthesis complex; (iii) proteins of the replication 
and recombination machines; (iv) proteins that modify the 
T4 DNA to protect it from degradation by its own nucleases 
and from other restriction enzymes; (v) several tRNAs, 
which are processed from precursor RNAs and supplement 
host tRNAs during translation; (vi) proteins that modify 
structure and function of the host RNA polymerase; (vii) at 
least two RNases: RegB protein, which selectively destroys 
certain early and probably host transcripts, and RNase H, which 
degrades RNA primers (307); (viii) a differential modulator 
of late RNA degradation (Dmd) (184-186); and (ix) a 
translational repressor (RegA protein) (261, 429). In addition, 
some prereplicative transcripts serve as primers for leading- 
strand DNA synthesis in origins of replication (see below). 

Late Transcription 

Transcription from late promoters requires concomitant 
DNA replication and the replacement of σ70 with a 
phage-encoded sigma factor, gp55, the product of a middle 
gene (432). Initiation from late promoters (129) (figure 18-4) 
also requires the adapter protein, gp33, and the sliding 
clamp of the replisome, gp45, which is loaded onto the DNA 
at single-strand interruptions by a gp44-gp62 clamp loader 
complex (158-160, 349, 350) (see also “DNA Replication” 
below). This sliding clamp tracks along DNA and 
allows RNA polymerase, bound at late promoters, to form 
open complexes, thereby coupling late T4 transcription 
to DNA replication. The α-subunits of the core 
polymerase play distinct additional roles in recognition of late 
promoters (402). In certain recombination-and-ligase- 
defective mutants, late transcription can occur without 
DNA replication, because single-strand interruptions in the 
DNA provide entry sites for the sliding clamp (129). 
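
The requirements for late transcription listed above can be summarized as a simple logical rule. The following is a minimal sketch only: the class `InfectionState`, its field names, and the reduction of each factor to a present/absent flag are assumptions made for illustration and are not part of the original description, which involves graded, quantitative effects.

```python
from dataclasses import dataclass


@dataclass
class InfectionState:
    """Hypothetical present/absent flags for one infected cell (illustration only)."""
    gp55: bool = True               # T4-encoded late sigma factor
    gp33: bool = True               # adapter (mediator) protein
    gp45: bool = True               # sliding clamp
    gp44_gp62: bool = True          # clamp loader complex
    replicating: bool = True        # concomitant DNA replication
    persistent_nicks: bool = False  # e.g., recombination-and-ligase-defective mutants


def late_transcription_possible(state: InfectionState) -> bool:
    """Boolean summary of the stated requirements for late promoter use."""
    # The clamp is loaded at single-strand interruptions, which are supplied
    # either by replication or by nicks that persist in certain mutants.
    entry_sites = state.replicating or state.persistent_nicks
    clamp_loaded = state.gp45 and state.gp44_gp62 and entry_sites
    return state.gp55 and state.gp33 and clamp_loaded
```

For example, `late_transcription_possible(InfectionState(replicating=False, persistent_nicks=True))` evaluates to true in this toy model, mirroring the mutant case in which late transcription proceeds without DNA replication.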

Genes downstream of any promoter may be poorly 
expressed because of transcription termination, because 
of RNA processing or degradation, or because ribosome 
binding sites are sequestered by secondary structures. 
Certain E. coli rho (transcription termination factor) 
mutations (also called host defective, hdF or nusD) prevent 
growth of wild-type T4 by causing premature termination 
of many T4 transcripts (383, 384). Mutations that allow 
growth in these rho mutants have been found in three 
nonessential T4 genes. It was initially thought that these 
encode transcriptional antiterminators, but current 
evidence suggests that at least one of them, goF1, allows 
T4 growth in the (nusD) rho mutants, because it stabilizes 
the few functional transcripts that have not been 
prematurely terminated. Plasmid-encoded Rop protein has 
a similar stabilizing effect, and pBR322 plasmids allow 
T4 growth in these rho mutant hosts (379, 422). 

The transitions from early to middle to late transcription 
are influenced by many factors, each individually having 
only minor effects (209). DNA topoisomerase mutations 
affecting torsional stress in the DNA specifically affect late 
transcription patterns (240). Moreover, a DNA binding 
protein (DsbA) may affect transcription from some, but not 
all, late promoters (146). Several nonessential T4 proteins 
affect the binding of minor host σ factors to core RNA 
polymerase, preventing competition with σ70, the only 
host sigma factor that is essential for T4 development. 
For example, the Mrh protein modulates phosphorylation 
of the host's σ32, which affects T4 late transcription at 42 °C 
(121, 122, 288), and the Srh and Srd proteins are proposed 
to be decoys of the host’s σ32 and σ70, respectively (288). 


The late genes code for virion components, for some DNA 
repair and recombination proteins, and for proteins that 
cut and package the complex vegetative DNA into preformed 
heads. A soluble lysozyme (gp e) lyses the cell wall of 
host bacteria to release the progeny phage particles. It is 
evolutionarily related to a baseplate lysozyme (gp 5) that 
attacks the cell wall from the outside during infection 
(81,189, 292). The gp e lysozyme gains access to its substrate 
in the outer cell wall through a protein channel that forms 
from the holin (gp t), when the membrane potential breaks 
down. In T4 this process is regulated, in part, in response 
to superinfection, sensed by the gp rI (1, 315, 329, 330). 

Post-Transcriptional Controls 

As with other phages, post-transcriptional mechanisms 
modulate expression of T4 genes in a variety of ways: (i) by 
modulating translation initiation via potential secondary 
RNA structures and/or by coupling translation of two or 
more genes, (ii) by translational repression, (iii) by proteins 
affecting ribosome structure and function, (iv) by processing 
of primary transcripts, and (v) by the rate of degradation. 
In addition, expression of at least four T4 genes requires 
splicing, or in a variation so far unique among phages, 
skipping of segments of primary transcripts. 

RNA Structures 

Due to the interspersion of early and late genes on the 
T4 map (figure 18-2), many early and middle transcripts are 
extended into late genes (figure 18-4). Nevertheless, few 
corresponding late proteins are synthesized early at 
temperatures below 37 °C. It is remarkable that expression 
of at least 10 such late genes investigated so far follows 
similar patterns: in the long early transcripts a hairpin 
sequesters the translation initiation region, either the 
Shine-Dalgarno sequence or the initiation codon, or both. 
A late promoter immediately upstream of the late gene(s) 
directs synthesis of transcripts that cannot form the 
hairpin; these transcripts are efficiently translated (figure 18-4). 
Expression of gene 49 coding for endonuclease VII (18) 
is one example. These late genes can be translated early 
in vivo at high temperatures, or in vitro when the RNA is 
broken, because in broken early RNA some sequestered 
late ribosome binding sites become accessible. We 
speculate that this pattern may have a selective advantage because 
if RNA polymerase pauses at certain hairpins formed within 
early transcripts, then transcription initiation from an 
adjacent late promoter might be facilitated. 

Translational coupling can coordinate synthesis of 
proteins encoded in multicistronic RNAs. It has been 
implicated in T4 gene expression early on (380). It can serve 
to coordinate expression of proteins involved in the same 
pathway (e.g., the tail fibers gp34 and gp35, or subunits of 
the terminase). Translation of the terminase subunit gp16 
may disrupt a hairpin that otherwise prevents translation 
initiation of the overlapping gp17 (119). Translational 
coupling can also adjust the synthesis of proteins that form 
heteromeric complexes. For example, the pentameric clamp 
loader complex contains four gp44s and one gp62 (133, 
403, 440). This ratio is regulated by translational coupling. 
In addition, coupling might interconnect different 
physiological processes, for example coupled translation of vs 
(modifier of valine synthetase) and tk (thymidine kinase) 
might allow communication between translation and DNA 
metabolism. 

Translational Repressors 

Three regulatory systems of T4 depend on translational 
repression that is autogenously regulated. Gene 32 codes 
for the major single-stranded DNA binding protein (SSB) 
involved in DNA replication, recombination and repair (10). 
Its expression is controlled by a pseudoknot upstream of 
the ribosome binding site and adjacent A-rich repeats (257, 
261, 360), and by cooperative binding of gp32 to single-stranded 
(ss) DNA and RNA (412, 415), which is stronger to ssDNA than 
to RNA. Binding of gp32 to its mRNA is nucleated 
by preferential binding to the pseudoknot. Cooperative 
binding of gp32 to the adjacent unstructured leader 
transcript eventually obliterates the ribosome binding site 
and prevents more synthesis of gp32 when, and only when, 
there is excessive gp32 in infected cells (99, 220, 222, 257, 
327, 360-362, 415). This beautiful, intricate system 
allows protection of all ssDNA without preventing general 
translation from most mRNAs under many different 
growth conditions. 

Gene 43, encoding DNA polymerase, is similarly 
autoregulated (222). A stem-loop in the leader RNA binds excess 
DNA polymerase, thereby obliterating the adjacent 
ribosome binding site (12, 261, 323, 324, 418). 

A more general, nonessential translational repressor, 
RegA protein, binds to several specific transcripts, 
reducing translation of several T4 replication proteins and of 
some host proteins. This repression is most apparent under 
experimental conditions that prolong early transcription 
(429). The RegA protein binds to AU-rich regions of several 
leader regions, but a consensus sequence is not apparent. 
The crystal structure of this protein is known (188), 
and interactions with target RNA are inferred from 
mutational studies (3, 4, 139, 238, 261, 263, 312, 326, 354, 404, 
405, 424, 434). 

T4-Evoked Modifications of Ribosomes 

Translation of host messages ceases immediately after 
T4 infection. Mysteriously, it is inhibited by infection with 
DNA-free particles, so-called ghosts (100). It is thought 
that some cells can recover from this immediate 
turnoff, but in successfully infected cells, ribosomes have 
already been programmed to translate T4 messages 
exclusively. A few host membrane proteins are the notable 
exception. There is extensive, controversial literature, 
reviewed with great fairness (428), concerning to what 
extent T4 infection alters the host's ribosomes. On balance, 
it appears that ribosomes from T4-infected cells differ 
mainly in the content of their S1 protein. This protein is 
one of several T4 proteins that are ADP-ribosylated by the 
T4 ModB protein (Wolfgang Rüger, personal 
communication). It is conceivable that this modification lowers 
the affinity of S1 for ribosomes, and that ADP-ribosylation 
by the related Alt protein, which is packaged into 
phage particles, is responsible for the ghost effect 
mentioned above. It is suspected but not yet shown that 
ribosomes acquire a few phage-encoded proteins after 
infection (227). 

RNA Processing and Degradation 

Host RNases, especially the RNase E complex (69, 144, 
298, 410) now called the “degradosome” (see (391) for 
review) and RNase III (75, 88), contribute to processing of 
T4-encoded tRNAs (352) and to degradation of T4 
mRNAs (69, 281, 291). In at least one case RNase III 
cleavage activates translation from an internal ribosome 
binding site to produce a truncated terminase protein, 
gp17' (119). 

Selective degradation of early T4 transcripts at late 
times is directed by at least two mechanisms. A T4 
regB-encoded ribonuclease cleaves early transcripts in their 
ribosome binding sites (72, 346, 347, 351, 406). Other yet 
unknown nucleases, indirectly controlled by MotA, are 
thought to selectively degrade early transcripts. They 
spare late transcripts because the T4 dmd gene, among 
others, directly or indirectly protects late transcripts 
(184-186). 

RNA processing is of utmost importance for the final 
activities of eight T4-encoded tRNAs (reviewed by (352) 
and (281)). Although these tRNAs are nonessential under 
standard laboratory conditions, they are essential under 
other conditions and are thought to enhance efficient 
synthesis of most T4 gene products. The codons corresponding 
to the T4 tRNAs are used significantly more frequently in 
T4 than in E. coli. The T4 tRNA region can be transcribed 
from several promoters (56) as large transcripts 
containing all genes and from several internal promoters. The 
stepwise processing yielding eight tRNAs and two small 
RNAs of unknown function is accomplished by host 
RNases, to some extent by autocatalysis, and by a nucleotidyl 
transferase that adds CCA to some of the products (88). At 
least one nonessential T4 gene, mb = M1 = cef, is 
important for processing of three tRNAs. This gene becomes 
essential when a host gene, probably coding for a tRNA, is 
defective (340). 
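
The statement that codons read by the T4 tRNAs are used more frequently in T4 than in E. coli rests on a comparison of relative codon frequencies. A minimal sketch of such a comparison follows. The function names, the `min_ratio` threshold, and the toy sequences in the usage note are hypothetical and serve only as illustration; real analyses use complete sets of annotated coding sequences.

```python
from collections import Counter


def codon_frequencies(cds: str) -> dict:
    """Relative frequency of each codon in an in-frame coding sequence."""
    cds = cds.upper().replace("U", "T")
    usable = len(cds) - len(cds) % 3          # ignore a trailing partial codon
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    counts = Counter(codons)
    total = len(codons)
    return {codon: n / total for codon, n in counts.items()}


def enriched_codons(phage_cds: str, host_cds: str, min_ratio: float = 2.0) -> list:
    """Codons at least min_ratio times more frequent in the phage than in the host."""
    phage = codon_frequencies(phage_cds)
    host = codon_frequencies(host_cds)
    enriched = []
    for codon, freq in phage.items():
        host_freq = host.get(codon, 0.0)
        # A codon absent from the host sample counts as enriched.
        if host_freq == 0.0 or freq / host_freq >= min_ratio:
            enriched.append(codon)
    return sorted(enriched)
```

With the invented inputs `enriched_codons("ATGAGAAGAAGA", "ATGCGTCGTAGA")`, only AGA is reported, because it makes up three quarters of the first sequence but one quarter of the second.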
Introns 

The genes td, nrdB, nrdD, and 60 are interrupted by 
intervening untranslated sequences (24, 171, 369). The first three of 
these introns can self-excise (24, 369). They form three- 
dimensional structures related to those of other group I 
introns (24, 369). The td and nrdD introns encode 
endonucleases of the GIY-YIG type, whereas the nrdB intron encodes 
a defective endonuclease of the H-N-H type. In the related 
RB5 phage this intron endonuclease is complete. These 
endonucleases are important for “homing” of the introns 
(24). Free-standing, related endonucleases segA through 
segF (363) and mobA through mobE (262) exist in the 
T4 genome, but are lacking in many related genomes. The 
Seg endonucleases (and probably the Mob enzymes as well) 
introduce sequence-specific DNA breaks (28, 182, 183, 364, 
365) whose recombinational repair leads to exclusion 
of markers close to the break and to their gene conversion to 
alleles of the unbroken chromosome (24, 28, 290, 300, 365) (G. 
Mosig, unpublished data). Actually, the SegF endonuclease 
appears to be a chimera of Seg and Mob-like endonucleases 
(128, 262). 

In contrast, the intron in gene 60 is not excised but 
is skipped during translation, probably because it can fold 
into a superstable hairpin that is skipped by the ribosome 
(134,171). 

Protein Degradation 

In host strains containing the e14 element, a defective 
prophage, the host-encoded translation factor EF-Tu is 
cleaved and functionally inactivated for translation of late 
T4 proteins (132, 377, 442). An N-terminal peptide of the 
major capsid protein gp23, called Gol, combines with the 
Lit protein, encoded by the host's e14 element, to cleave 
EF-Tu at a specific site. Because gp23 is a late protein, only late 
translation is affected. 


Modification and Restriction of T4 DNA 

In the following discussion the term “restriction” is used in 
its broadest meaning and is not limited to type II restriction 
enzymes. The complex modifications and restrictions of 
T4 DNA and of other DNA by T4 can be best rationalized 
as a result of ongoing evolution that is accelerated by 
strong selective forces and exchanges between the genomes 
of T4 and its host, including plasmid and prophage 
genomes that reside in the host. 

As mentioned above, T4 DNA contains HMC instead of 
C residues. This modification protects the T4 DNA against 
most foreign restriction enzymes and against T4-encoded 
restriction endonuclease II (gp denA). The latter enzyme, 
together with endonuclease IV (gp denB) and the gene 
46/47-controlled nuclease, degrade the host DNA as part of 
the parasitic strategy to usurp the host (66-68). Although 
HMC residues confer resistance to most restriction systems, 
they render DNA susceptible to the Mcr restriction systems 
of the host. These host functions were the first restriction 
enzymes discovered (247). Originally called rgl, they are 
now called McrA and McrBC, because they restrict DNA 
containing methylcytosine or HMC. 

McrA of E. coli K-12, like the lit gene mentioned above, 
resides in the e14 element (17). Thus, in e14-containing 
bacteria T4 is excluded at several levels. 

The mcrBC genes, probably acquired by a recent 
transposition, are part of a cluster of restriction genes at 99 min 
on the E. coli K-12 map (68). The proteins assemble into 
hexameric GTP-dependent rings (317). A shorter version 
of McrB, McrBs, modulates McrB’s activity (318). 

These restriction functions are inactive when the HMC 
residues are glycosylated. In T4 DNA, all HMC residues are 
modified: 70% with α-, and 30% with β-glucosyl linkages. 
In addition, an early T4 antirestriction protein (gp arn) 
protects nonglucosylated T4 DNA against McrBC but not 
other host restriction enzymes. 

T2, T4, and T6 encode α-glucosyl transferases, which 
modify HMC DNA after it has been polymerized. In T2 and 
T6 DNA, there are no β-glucosyl transferases, and 25% of 
the HMC residues remain unglucosylated, probably because 
the α-glucosyl transferases cannot modify two adjacent 
HMC residues. T6 contains many di-glucosylated residues. 
The relevant enzyme is unknown. 

In addition, T2 and T4, but not T6, encode a Dam 
methylase that methylates 0.5-1.5% of the adenine residues at the 
N6 positions, mostly but not exclusively at GATC sequences. 
These enzymes exhibit patches of similar amino acids 
with the E. coli Dam methylase and the DpnII methylase of 
Diplococcus pneumoniae. The only proven physiological role 
of adenine methylation is protection against the phage 
P1 restriction system (see chapter 24), when there is no 
HMC glycosylation (68). 
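
To relate a methylated fraction of adenines to the density of GATC sequences, one can count GATC sites in a DNA sequence and express the adenines they contain as a fraction of all adenines on both strands. The sketch below does this under two labeled assumptions that go beyond the text: that every GATC adenine is methylated, and that no other sites are used. The function name and the example sequence are hypothetical.

```python
def gatc_adenine_fraction(top_strand: str) -> tuple:
    """Return (number of GATC sites, fraction of all adenines lying in GATC).

    Only the top strand is supplied. Adenines on the bottom strand pair with
    thymines on the top strand, so T counts stand in for bottom-strand A counts.
    """
    seq = top_strand.upper()
    sites = sum(1 for i in range(len(seq) - 3) if seq[i:i + 4] == "GATC")
    total_adenines = seq.count("A") + seq.count("T")
    # GATC is its own reverse complement: each site holds one adenine per strand.
    gatc_adenines = 2 * sites
    fraction = gatc_adenines / total_adenines if total_adenines else 0.0
    return sites, fraction
```

For the invented sequence `"GATCAAAA"`, the function finds one site and returns a fraction of 2/6, since two of the six adenines in the duplex lie within GATC.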

The classical example of phage exclusions by genes of 
resident prophages is that of T4 rII mutants by the rexA 
and rexB genes of λ lysogens. This exclusion was elegantly 
used in Benzer’s classical analyses of structure and function 
of a gene. It occurs at the time of transition from 
join-copy to join-cut-copy recombination, and it involves several 
enzymes important in the latter mechanism (279, 296). The 
molecular mechanism of this exclusion is unknown. It has 
been proposed that the RexB proteins form an ion channel 
that, when opened after infection with T4 rII mutants, 
leads to cessation of host metabolism (321). 

Another cryptic DNA element of certain E. coli strains, 
prr, encodes a protein that excludes T4 RNA ligase 
1/polynucleotide kinase deficient mutants. PrrC protein is a 
cryptic RNA endonuclease that is activated by the small 
(26 residues) T4 Stp protein to cleave the anticodon loop of 
an essential host tRNALys. T4 RNA ligase 1/polynucleotide 
kinase can repair this damage, but in the absence of this T4 
protein the cleavage of this tRNA is lethal to T4 protein 
synthesis. Intriguingly, the prrC gene is located between 
three genes of a type IC restriction cassette. 

Phage P2 lysogens exclude T4 by two mechanisms: the 
P2 Tin protein poisons the single-stranded DNA binding 
protein gp32 that is essential for all T4 DNA replication and 
recombination (297), and the P2 Old protein can degrade 
T4 (as well as λ) DNA from ends, nicks, and gaps (63), 
unless the ends are protected by certain proteins (gp2 in 
the case of T4; 14, 244, 419). 

DNA Replication, Recombination, and Repair 

T4 codes for all the components of its own replication and 
recombination complexes, for an excision-repair enzyme, 
denV, and for many enzymes that synthesize precursors for 
or modify T4 DNA. Some of the latter proteins (dCTPase, 
dCMP hydroxymethylase, deoxynucleotide kinase) 
specifically allow incorporation of HMC triphosphates during T4 
DNA synthesis. The α and β DNA glucosylases sugar-coat 
the HMC-containing DNA after it has been polymerized, 
to protect it from attack by Mcr nucleases described above. 

The basic replisome proteins of all organisms share 
sequence and structural similarities, and some of the T4 
replication proteins can partially function in eukaryotic 
in vitro systems (307). 

Initiation of T4 DNA replication is far more complex than 
structure and function of the basic replisomes. As we discuss 
in detail below, replication forks can be initiated in several 
different ways. There is necessary, but limited initiation 
from bona fide origins; most subsequent replication forks 
are initiated from intermediates of homologous 
recombination. There are several redundant origins of DNA replication, 
and there are several pathways of recombination-dependent 
DNA replication. We surmise that these redundant pathways 
are exquisitely suited to adjust DNA replication to different 
transcription patterns, described above, to environmental 
conditions and to packaging of DNA described below. 

The Basic Replisome 

The combination of genetic experimentation and virtuoso 
biochemical and biophysical characterization of 
replication proteins has led to an understanding of functions and 
interactions of replication proteins in the basic replisome, 
a biological machine that moves the replication fork 
(or through which replicating DNA is passaged) (307, 310) 
(figure 18-5). 

Figure 18-5 The proteins of the DNA replication fork. From 
N. Nossal, with permission. Labels: ssDNA binding (32); 
clamp loader (44/62); loading protein (59). 

Seven T4 proteins, corresponding to genes 43 (DNA 
polymerase), 44 and 62 (sliding clamp loader), 45 (sliding clamp), 
41 (DNA helicase), 61 (primase to synthesize primers for 
Okazaki fragments), and 32 (SSB) together constitute the 
basic replisome, a biological machine (7, 8) that can move 
the replication fork through model templates at in vivo 
speeds. To seal Okazaki fragments, RNase H has to remove 
the RNA primers (and adjacent DNA) and DNA polymerase 
has to fill the gaps so that DNA ligase can ligate the DNA 
fragments (29, 40, 74, 179, 180, 301, 309). Additionally, T4 
gp59 is essential for loading the gp41 helicase onto 
recombination junctions in vivo (286). It also helps load this 
helicase onto an origin in vitro (179, 309). In vivo host DNA 
polymerase I or host RNase H can substitute for T4 RNase H 
(169), and E. coli ligase can substitute for T4 ligase, if T4 rII 
genes are mutated (34, 73, 191, 221). The three-dimensional 
structure of several of these proteins (or segments of them) 
(301, 302, 357-359) can now be correlated with defects 
defined by mutations. Assembly and interactions between 
these proteins are being elucidated by sensitive biophysical 
methods (29). 

In contrast to the E. coli replisome, the T4 replisome 
has not been isolated as a functional complex. Possibly, 
the T4 core replisome components interact less strongly 
than those of other organisms to facilitate interactions 
of DNA polymerase with different accessory replication 
and recombination proteins during different developmental 
stages as discussed below. Clearly, a real understanding 
of the intracellular activities of individual enzymes in 
such complexes requires a combination of mutational, 
biochemical, and biophysical analyses. 

This has been done exquisitely for the DNA polymerase 
of T4, and the related RB69 DNA polymerase (20, 120, 193, 
359, 420, 421). Mutator and antimutator mutations occur 
in domains that are important for polymerizing and 
proofreading activities. These studies also revealed hinge 
regions that are important for switching the primer 
terminus between the polymerizing and proofreading sites. 
All these contribute to the exceptionally high fidelity of 
this enzyme (95, 97, 308, 334-337). 

Similarly, mutations affecting the core of the SSB, 
gp32 (357,358), define some interactions with other proteins 
(279, 290). In addition, limited proteolysis of this protein 
has revealed the importance of the N-terminal segment for 
cooperative binding to DNA and of the C-terminal segment 
for interactions with several recombination proteins (433). 

Some DNA replication genes are clustered, but others 
are scattered throughout the genome. Most are transcribed 
from both early and middle promoters. Coordination of 
T4 DNA replication with other processes is achieved by the 
assembly and disassembly of protein complexes that use 
stretches of DNA covered with SSB, gp32, as scaffolds. 
These interactions have been characterized in vivo and 
in vitro (115, 270, 274, 283, 284, 296, 427). 

In wild-type T4, leading- and lagging-strand syntheses 
are coupled, but they can be uncoupled both in vivo and 
in vitro (181, 293). Primase mutants are viable because 
at T4 origins RNA polymerase-dependent transcripts 
prime leading-strand DNA synthesis (246, 287), and priming 
of Okazaki fragments can be bypassed by 
recombinational mechanisms (290, 293) discussed below. In 
topoisomerase mutants, leading- and lagging-strand synthesis are 
also partially uncoupled (240, 401), and their 
replication also depends on recombination (111, 173, 212, 279, 290, 
306, 387). 

Initiation from Origins 

Several origins of replication have been detected under 
various conditions (147, 202, 215, 216, 240, 248, 252, 282, 
385, 439); for reviews see (218, 287). In infecting T4 
chromosomes, the first round of DNA replication is initiated from 
only one of these origins (85, 86), perhaps because at 
the time of origin initiation there are limited supplies of 
replisome components. Because of the circular permutation 
of chromosomal ends (indicated in figure 18-4), in each 
individual chromosome any origin is located at a different 
distance from the DNA termini. Four of these origins (A, E, 
F, and G in figure 18-2) have been closely investigated. 
Each one has a different sequence. Three origins (A, F, 
and G) require transcription from middle promoters (143, 
218, 248, 286, 287), whereas oriE uses an early promoter 
(287, 407). For simplicity, figure 18-4 depicts initiation from 
an arbitrary generic origin. Transcripts initiated from 
the origin promoters serve as primers for leading-strand 
DNA synthesis. These transcripts also code for proteins, 
but obviously priming (requiring transcripts to reinvade 
the DNA) and use as mRNAs (requiring access of ribosomes) 
are mutually exclusive. 
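
The consequence of circular permutation for origin position can be made concrete with a small calculation. In this sketch a linear chromosome is cut from a circular map at an arbitrary start coordinate and carries a terminal redundancy, and the function reports how far a given origin lies from each end. The function name, its parameters, and the coordinates in the usage note are hypothetical values chosen for illustration, not T4 map positions.

```python
def origin_distances(genome_len: int, terminal_redundancy: int,
                     start: int, origin: int) -> tuple:
    """Distances of an origin from the left and right ends of one chromosome.

    The chromosome begins at map coordinate `start`, runs once around a
    circular map of length `genome_len`, and continues for an extra
    `terminal_redundancy` units. Only the first copy of the origin is
    considered, even if it also falls within the repeated terminal segment.
    """
    chromosome_len = genome_len + terminal_redundancy
    from_left = (origin - start) % genome_len
    from_right = chromosome_len - from_left
    return from_left, from_right
```

With a toy map of 1000 units, a redundancy of 30, and an origin at coordinate 400, a chromosome starting at 0 places the origin 400 units from the left end, whereas one starting at 900 places it 500 units from that end.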

In vivo, the transition from RNA primers to 
leading-strand DNA occurs at several sites more than 1 kb 
downstream of the primer promoter (286, 287, 407), implying 
that the priming transcript has to reinvade the DNA, 
forming an R-loop. Reinvasion can be facilitated by global 
torsional stress in the DNA. Indeed, in vitro initiation 
from oriF has been achieved with an RNA primer that 
had been hybridized to supercoiled plasmid DNA (309). 
The 3' end of this RNA, located at the end of an unwound 
region, determined the priming site in vitro. This site is 
different from the predominant in vivo priming sites, 
determined by primer extension, near the transcription 
terminator between uvsY.-1 and uvsW (287). 

In contrast, oriE (147, 211) can function when 
torsional stress is reduced (e.g., by excessive 32P damage, by 
oxidative stress damage, or by mutation in the host gyrase 
gene: 240, 294). OriE function depends on a small protein, 
RepEB, and the auxiliary protein RepEA, both encoded by 
sisters of the primer transcript, and on at least one of five 
repeat sequences, so-called iterons upstream of the primer 
promoter (287, 407). The binding of RepEB to one or more 
of these iterons is required for oriE to function (150). It is 
postulated that RepEB binding to an iteron facilitates 
loading of a DNA helicase upstream of the primer promoter, 
and that unwinding of DNA behind the RNA polymerase 
by the helicase can compensate for the lack of global 
torsional stress in oriE. In some respects oriE resembles oriC 
of E. coli, in which leading-strand DNA synthesis can 
alternatively be initiated from primase-dependent RNA or from 
primer transcripts (210). Less well characterized oriKs also 
depend on priming by transcripts (208). 

Several other T4 origins (202, 439) have not been mapped
precisely. Interestingly, an origin near gene 17 has been 
detected only in nonmodified (cytosine-containing) DNA 
(439), or when the host contained a mutation in the transcription
terminator rho gene (385). Possibly, altered transcription
termination on cytosine-containing DNA, by T4
gp alc or by certain rho mutations, may be important for
activating this origin. 

The transition from σ70-dependent prereplicative
transcription to T4 σgp55-dependent late transcription inhibits
initiation from these replication origins because
the modified RNA polymerase no longer recognizes origin 
promoters (246, 407) and because a late protein, the UvsW 
RNA-DNA helicase, actively unwinds any persisting 
R-loops and prevents transcripts from serving as primers 
for DNA synthesis (65, 102).

Recombination and Recombination-Dependent DNA Replication

The detection of homologous recombination has provided 
a powerful argument that phages are models for living 
systems (165). A strong connection between replication and 
recombination was suspected early on (413). This hypothesis 
fell temporarily out of favor when it was shown that in 
phage λ viable recombinants and in T4 branched recombinational
intermediates can be formed in the absence of
DNA replication (57-59, 258). However, the recombination
and segregation patterns at the ends of incomplete
genomes (packaged in small T4 particles, see below) (276, 
289) implied and led to the first compelling demonstration 
of recombination-dependent DNA replication (246). The 
current view of these processes has been thoroughly 
reviewed (277, 279, 290).

As soon as the first replication forks have reached an end,
recombination intermediates formed at these ends compete
effectively for assembly of recombination-dependent replisomes
by what we have called join-copy recombination/
replication (pathway II in figure 18-4). Because the ends
correspond to random positions of the genetic map, origins 
are no longer preferentially replicated (85, 246). 

The terminal redundancies of individual T4 chromosomes
are sufficiently large (~5 kb) to allow recombination
between replicated and unreplicated termini after infection
of E. coli by a single T4 particle (280). In electron micrographs,
the resulting structures appear like rolling circles
or more complex “firewheels.” Firewheels as intermediates 
of T4 DNA replication had been proposed earlier, but they 
had been interpreted in terms of rolling circle initiation 
(425, 426). However, later experiments showed that they are
formed by multiple recombinational invasions (85). 

After late proteins have been synthesized, a late “join- 
cut-copy” recombination/replication pathway can operate 
(figure 18-4). These and additional pathways are discussed 
below. Keeping in mind that there are many ambiguities 
and caveats in defining recombination pathways based on 
participating enzymes or structures of recombining DNA, 
five different recombination/replication pathways can be 
distinguished (279). They are not rigorously separated, but 
contain some common steps, and depend on some common 
proteins that are mixed and matched with other proteins 
in different complexes directing different pathways. 

Obviously, the time when any pathway is active depends 
on expression of the genes required for that pathway. 
Because late gene expression depends on DNA replication, 
and because some proteins participate both in packaging 
of DNA and in recombination, all these DNA transactions
are tightly interwoven (figure 18-4). Redundant alternative
modes of replication and recombination ensure
that these processes work under many different conditions 
and during different stages of development (290, 293). 

The single-strand annealing pathway I (57, 58) was 
analyzed when DNA replication was prevented. Under this 
condition the infecting DNA has to be broken, or nicked, 
and ssDNA regions have to be generated by nucleases 
before complementary DNA strands from different parents 
can anneal. Therefore, recombinational intermediates 
appeared only late after infection, and no viable recombinants
were produced. Under replication-proficient conditions,
recombinogenic ssDNA termini appear earlier, due to
failure to initiate an Okazaki fragment at each terminus. If 
they anneal with a homologous region, replication can also 
be initiated from their 3' end (279) (figure 18-4). Gp32 can 
catalyze annealing, explaining the viability of all T4 uvsX 
mutants (341). The DNA branches generated by ssDNA 
annealing are obviously less complex than D-loop branches, 
and therefore their resolution may depend less on endonuclease
VII (89), which is actually inhibited by UvsX and
UvsY proteins (47). Both annealing and invasion are also
important to initiate recombination in phage λ and yeast
(319, 381), and probably in all organisms. 

In the presence of UvsX and UvsY proteins, the unreplicated
ssDNA termini invade double-stranded homologous
DNA to form D-loops. T4 UvsX is a RecA homolog (264) and 
T4 UvsY is a hexameric recombination-replication mediator 
that facilitates loading of UvsX to ssDNA covered with gp32 
(22, 23, 51). Gp59, another mediator protein, facilitates loading
of the gp41 helicase to recombination junctions, a
process that is important to establishing replication forks 
and also to driving branch migration of recombining DNA 
(179, 180, 301, 348). Replication initiated from the invading 
DNA of D-loops, join-copy replication/recombination (246), 
is depicted as pathway II in figure 18-4. It occurs early 
because it requires only prereplicative T4 proteins.

In contrast, late recombination pathway III (figure 18-4) 
requires, in addition, endonuclease VII or terminase (290, 
293). When these enzymes make a single cut in the 
invaded strand of the D-loop, the 3' end can initiate replication
of that parent whose ends invaded the D-loop. Because
endoVII and terminase are synthesized predominantly 
late, pathway III is a late pathway. We have dubbed this 
pathway “join-cut-copy” (277, 290). It is essential in primase
mutants (and important in topoisomerase mutants)
because it allows the DNA strand that was displaced during
origin-initiated replication to be copied
(figure 18-4). Origin initiation in these
mutants starts at the same time as in wild-type T4, but
recombination-dependent replication is delayed compared
with wild-type T4, causing the so-called DNA-delay (DD)
phenotype (279, 290, 293).

The classical double-strand DNA break repair pathway IV 
(393) results in only limited DNA synthesis. This pathway 
and a synthesis-dependent strand annealing pathway V 
(305), based on in vitro results with T4 proteins (114), 
have been proposed to explain homing of T4 introns 
(76, 299, 300) and the repair of double-strand breaks 
introduced by endonucleases that resemble intron nucleases
(28). This repair also uses multiple pathways (24,
300, 320). 

The in vitro recombination-dependent replication reactions
(114, 213, 214) are primed by 3' ends of ssDNA. With
an excess of UvsX protein, and in the absence of gp uvsX, 
primase or topoisomerase, they result in a conservative 
DNA replication called “bubble-migration” in which the 
nascent DNA strand is extruded from the DNA template 
much like nascent transcripts are extruded by RNA 
polymerase (114). There is no direct evidence that bubble 
migration occurs in vivo at physiological concentrations of 
UvsX protein and in the presence of topoisomerase (218). 
Surprisingly, the T4 proteins gp46 and gp47, whose genes 
are homologous to those of the eukaryotic Mre11-Rad50 complex
(79, 80) and yield the most recombination-defective mutations
(35, 37, 38), are not required in the current in vitro
recombination systems.

Ultimately, reiteration of recombination-dependent
DNA replication by these different pathways generates a 
highly branched concatemeric DNA network in which no 
individual chromosomes can be distinguished (172). The 
branches can be resolved either by endo VII (gp49) or the
largest endonuclease subunit of terminase (gp17). Endonuclease
VII (gp49) cuts Holliday and 3-way recombinational
junctions as well as heteroduplex mismatches and
other DNA distortions in vivo and in vitro (43, 44, 46, 112,
123, 137, 141, 198-200, 204, 295, 296, 328, 367, 378). Gp17
of terminase can cut single-stranded regions in supercoiled
DNA (41, 42) and at junctions between ssDNA and
dsDNA (118). The endonuclease activities of T4 terminase, 
the enzyme that generates chromosomal ends during 
packaging, reside in the subunits encoded by gene 17 (41, 
42, 223, 234, 242, 250). At least four proteins are initiated 
from different in-frame initiation codons. Only the largest 
of these has a ssDNA binding domain that allows it to 
bind to and cut preferentially the junctions of ssDNA and 
dsDNA (118). It is surmised that only the largest gp17 is
also involved in join-cut-copy recombination-dependent 
DNA replication (290). In any case, because only the join- 
cut-copy mode can bypass the requirement for primase or 
topoisomerase in T4 DNA replication, double and triple 
mutants defective in these enzymes are defective in the 
bypass replication (290, 293). 

Extensive heteroduplex regions containing mismatched 
or looped-out bases are formed during recombination 
in vivo (295, 296) and in vitro (45, 46, 198). Their repair is 
mediated by endo VII, the enzyme that cuts Holliday 
junctions (198, 200, 296), not by mutSHL or its homologs
(97). Interestingly, this repair appears to occur during 
packaging, when endoVII is associated with the portal 
protein, gp20 (138). 

Recombination frequencies decline rapidly when the 
recombining molecules share less than 50 bp of homology 
(19, 94, 136, 371). This decline is probably not due to 
inability to pair, because in vivo and in vitro heteroduplexes 
containing extensive mismatches are formed (45, 295, 296) 
and the UvsX-related E. coli RecA protein requires only a 
few homologous base pairs to initiate pairing (170). Instead, 
recombination seems to be reduced by an enhanced 
probability of cutting by endo VII during branch migration 
through partially heterologous regions and repair of 
mismatched heteroduplex regions (see below) (46, 123, 141, 
198, 200, 204, 290, 295, 296, 366, 367, 378, 441).

DNA Repair 

As in other organisms, damage and mismatches in T4 
DNA can be repaired in several different ways, and repair 
defects result in increased sensitivities to such damage 
and/or to differences in mutation rates. Recombination- 
dependent DNA replication is an important component of 
repairing DNA damage and of restoring broken or stalled
replication forks (37, 38, 82, 131, 205, 213, 214, 217, 219, 271,
280, 387). There is, however, a difference between recombination
patterns at “natural” chromosomal ends and secondary
ends generated by endonucleases. All markers near
natural chromosomal ends appear in progeny particles
(276, 289). In contrast, some genetic markers near nuclease-generated
ends are degraded in both DNA strands from
the broken chromosome and are replaced by alleles from
the unbroken chromosome, resulting in apparent gene
conversion (24, 128, 365, 386). We surmise that T4 gp2 (370)
protects natural, but not secondary, chromosomal ends 
from degradation. 

In addition to the roles of homologous recombination 
pathways in DNA repair, discussed in the preceding section, 
there are two nonrecombinational repair pathways. The first 
mechanism shown to repair UV-damaged T-even DNA was 
photoreactivation (103). It is now known that the host 
enzyme responsible for this repair reverts exclusively pyrimidine
dimers to the monomeric state (reviewed by 96, 97).

The second mechanism was first revealed by Harm 
(149), who isolated DNA repair mutants in two T4 genes 
that are now called denV and uvsX. These mutations defined 
excision repair and recombinational repair as major T4 
pathways. These pathways are physiologically connected, 
because the ssDNA binding protein, gp32, participates in 
both excision repair and recombinational repair (271) and 
because the intermediates of excision repair are thought 
to generate substrates for recombination, for example by 
stalling or disrupting replication forks (245). 

The product of denV, endonuclease V, is the prototype of
a base excision repair protein. It has both N-glycosylase 
and abasic lyase activities (90, 245), which incise the DNA 
(forming a covalently linked protein-DNA intermediate). 
These activities remove the pyrimidine dimers to allow
resynthesis by DNA polymerases, notably including E. coli DNA polymerase I
(392). Profound differences in UV sensitivities of T4 
and T2 are due to the presence of denV in T4 but not in 
T2 (90, 149).

The precise role of the gp46/gp47 complex in DNA repair 
is still as elusive as it is in recombination. A suppressor 
of gene 46/47 defects, das (157), suppresses the DNA repair 
deficiency but not the DNA synthesis arrest phenotype 
of T4 uvsX mutations (417). Das mutations change the 
sequence of T4 RNase H without inactivating this enzyme
(G. Mosig and N. Colowick, unpublished data), suggesting 
that they allow RNase H to substitute for defective nuclease 
activities of the 46/47 complex. 

DNA Packaging 

The branched, concatemeric vegetative T4 DNA generated 
by recombination-dependent DNA replication is cut by 
terminase, a heteromeric protein encoded by genes 16 and 
17 (331, 333). This protein complex binds to the vegetative
DNA and subsequently connects with gp20, the portal
vertex protein of the head, to form a “packasome” that, 
fueled by ATP hydrolysis, drives DNA into preformed heads 
(50, 241, 243). Gp16 binds to dsDNA. The endonuclease
activity and a weak ATPase activity of T4 terminase reside 
in gp17 (41, 42, 119, 234, 332). Gene 17 produces several
proteins of different sizes by initiation from in-frame 
internal initiation codons (119). These internal initiation 
sites are preferentially used in transcripts initiated from an 
internal promoter and in transcripts processed by RNase III
(119). As predicted from the similarity of its N-terminal 
domain with ssDNA binding proteins, the largest of these, 
and only the largest, binds to ssDNA segments and cuts 
them preferentially at junctions with dsDNA (118). Presumably,
this facilitates initiation of packaging from recombinational
junctions. The shorter products of gene 17, lacking the
ssDNA binding domain, suffice for packaging of precut 
mature DNA in vitro (138). 

Recombinational branches can be cut by T4 endonuclease
VII (gp49), discussed above. Alternatively they are
resolved by branch migration or by DNA synthesis (296), 
explaining why gene 49 mutations are not completely 
defective in recombination. DNA ligase, endonuclease V 
and topoisomerase are also required, presumably because 
only uninterrupted dsDNA can be wheeled into heads 
by the packasome (48). Because, as mentioned above, 
endonuclease VII and gp17 also play important roles in late,
join-cut-copy recombination-dependent DNA replication, 
these activities link DNA replication and packaging at a 
physiological level. 

The initiation of DNA packaging is a matter of debate, 
suggesting that this process, like other DNA transactions, 
can be initiated by different redundant mechanisms. A mechanism
based on synapsis of two homologous sequences
in genes 16 and 19 and on recognition of this sequence as a 
pac site by gp16 (243) has been proposed (48). This model is
based on preferential amplification of a DNA segment 
encompassing gene 17 between the two short homologous
sequences (437, 438), on preferential packaging of foreign 
DNA containing this sequence, and on the apparent 
overabundance of the corresponding restriction fragment 
in packaged T4 DNA. 

On the other hand, comparisons of maps based on cutting 
of chromosomal ends versus recombination showed no 
distortion in the gene 16 to gene 19 region, and only a slight 
distortion near gene 37 (273, 278, 289) (figure 18-2). Because 
the largest peptide made by gene 17 binds to ssDNA, and not 
to dsDNA, and cuts DNA preferentially at ssDNA-dsDNA 
junctions (118), we have proposed that packaging is preferentially
initiated at recombination junctions, which occur
at random positions of the genome. 

Probably both mechanisms can initiate packaging, and 
the different results reflect differences in the DNA substrates. 
In contrast to the experiments of Mosig (273, 278, 289), who 
investigated wild-type phages containing modified HMC
DNA, the substrate for the experiments of Lin and Black
(241) was unmodified dC-containing DNA since modified 
T4 DNA would not be cut by restriction enzymes. As 
mentioned above, the gene 17 region also contains a replication
origin that is preferentially used only in dC-containing
DNA (439). Processive packaging of 103-105% genome 
lengths to fill the preformed heads would randomize the 
ends in either case. 
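
The randomizing effect of processive headful packaging can be illustrated with a short calculation. The following Python sketch is not taken from the cited work; it assumes an idealized linear concatemer, a first cut at map position 0, and a constant headful of 1.03 genome lengths (the function name and its parameters are hypothetical). It lists the map position at which each successive packaged chromosome begins.

```python
from fractions import Fraction


def headful_start_positions(headful, n_heads, start=Fraction(0)):
    """Map positions (as fractions of one genome length) at which
    successive headfuls begin on an idealized concatemer.

    headful -- packaged length in genome equivalents (assumed constant)
    n_heads -- number of chromosomes packaged in series
    start   -- map position of the first cut (assumed to be 0 here)
    """
    positions = []
    pos = start
    for _ in range(n_heads):
        positions.append(pos % 1)  # position on the circular genetic map
        pos += headful             # the next headful starts where this one ended
    return positions


# Five sequential headfuls of 103% genome length each
for k, p in enumerate(headful_start_positions(Fraction(103, 100), 5), start=1):
    print(f"chromosome {k} begins at map position {float(p):.2f}")
```

Under these assumptions each chromosome begins 3% of a genome further along the map than the previous one, so a series of sequential headfuls soon samples positions around the whole map, whatever the position of the first cut.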

We consider the possibility that the basic terminase of a 
T-even ancestor was composed of gp16 and gp17', recognizing
a preferred pac site. Later acquisition of a ssDNA binding
domain allowed the T4 terminase to better adapt
packaging to the recombination-dominated life-style,
which is best served by random permutation of its chromosomal
ends.

Virion Structure and Assembly 

It is now clear that in all phage assembly systems studied, 
obligatory pathways of morphogenesis are determined at 
the level of protein-protein interactions and not at the level 
of sequential transcription of the structural genes (107). 
However, the complex regulation of T4 transcription and 
translation, discussed above, modulates the final levels of 
structural gene products late in infection. These regulated 
levels, in turn, are important in controlling successful 
assembly of the phage structures (108). The major components
of T4 virions (heads, tails, and long tail fibers)
are assembled via independent pathways in vivo. These 
substructures can be isolated and combined in vitro to 
yield infectious particles (105, 108).

Advances in cryo-electron microscopy and X-ray diffraction
now provide structural data on bacteriophages
from below 2 nm to atomic resolution (for reviews see 16, 
344, 345). Recent cryo-electron microscopic studies of 
T4 isometric heads (176, 311) confirm and extend earlier 
head-structure data, and combined crystallographic and
cryo-electron microscopic data on T4 tails (113, 187) provide structural details that are
relevant to the understanding of T4 tail function. 

Head Structure and Assembly 

The head assembly pathway is summarized in figure 18-6. 
The completed T4 head is composed primarily of the major 
cleaved capsid protein, gp23*, arranged in an icosahedral 
surface lattice of T = 13, elongated along a 5-fold axis (267). 
The length of the T4 head is defined by a number, Q, related
to the icosahedral T number. An illustration of Q numbers is
shown in figure 18-7 together with models of some T4 head
lengths. The 5-fold vertices, except for one, are occupied 
by the cleaved vertex protein (gp24*), whose sequence 
is related to gp23, probably reflecting gene duplication. 
The 5-fold vertices are likely replaced by gp23* in certain
gene 23 missense mutants that bypass the requirement
for gp24* (256). Those mutations in gene 23 that bypass
the gene 24 requirement can also affect head length (152).

Figure 18-6 Assembly and maturation of the T4 prohead. Locations of genes that control these processes are shown on the
circular genome map. Tail genes interspersed among the head genes are indicated on the inside of the circle. The figure is a
modification of that in Black and Showe (49).

T4 head precursors (proheads) are assembled around 
a core or scaffold, consisting mainly of gp22 and gp21, and 
also containing gp67, gp68, and gp (inh = lip). After prohead
assembly is complete, the scaffold proteins are cleaved
to short peptides, followed by expansion and stabilization 
of the capsid shell (50) by DNA (177,178). 

The vertex from which assembly is initiated (at the bacterial
membrane) contains the portal protein gp20. The
portal consists of a ring of 12 copies of gp20. Its role in 
packaging and ejection of DNA is discussed in chapter 6. 
In all dsDNA tailed bacteriophages studied so far, the portal 
protein is of similar construction, and in the Bacillus subtilis 
bacteriophage φ29 (chapter 22) it has been shown to be a
DNA translocating machine engaged in moving DNA into 
the head by a rotation mechanism (269, 373). 

Functional T4 heads of altered length can be assembled 
that contain DNA shorter or longer than the normal
complement (50, 231, 232, 276, 278, 289). Electron microscopy
shows that there are three main head length
classes, whereas widths are usually the same. Normal, 
elongated (giant), and isometric T4 heads are all 85 nm 
wide (104). The T4 variant containing 0.68 normal DNA 
length appears structurally isometric (T = 13, Q = 13).
The head length classes fit Q numbers of 13 for isometric 
(0.68 DNA length) phages, 17 and 18 (0.85 and 0.90 DNA 
lengths) for intermediates, and 21 for normal DNA length 
phages (illustrated in figure 18-7). Fractional DNA lengths, 
measured by sucrose gradient sedimentation and by 
genetic methods (276, 289), or by gel electrophoresis 
(231, 232), fit the model that these DNA molecules fill the 
head volume. These shorter and longer head lengths are 
related to the basic mechanism determining T4 head 
length (231, 232, 267). The surface lattice design was determined
directly from cryo-electron micrographs of isometric
heads (176, 311) following earlier studies on phage
T2 (54).

Figure 18-7 Surface lattice arrangements predicted for
isometric, intermediate, and normal-length T4 heads. 

A: Representations of normal, intermediate, and isometric 
heads. The definitions of the lattice parameters are shown 
below. For the isometric phage, the surface lattice is defined 
completely by the triangulation number (T), given as 
T(m,n) = m² + mn + n²; in the drawing m = 3, n = 1, and
T = 13. For an icosahedron elongated along a 5-fold axis,
another number, Q, is used to define the extent 
of elongation. B: The Q number can be any integer greater 
than T and is defined by the coordinates m' and n'. For 
the intermediate phage shown, m' = 3, n' = 2, and Q = 17. 
The normal-length head shown has m' = 3, n' = 3, and 
Q = 21; data from the laboratory of M. Rossmann et al. 
show that Q = 20 for their sample of T4 (113). 
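
The lattice numbers quoted in the legend of figure 18-7 can be checked with a few lines of Python. The function for T follows the formula given in the legend. The relation used for Q, namely Q = mm' + mn' + nn', is not stated in the legend; it is an assumption, taken from the usual description of prolate icosahedra, and is used here only because it reproduces the values quoted for the drawn heads. The function names are hypothetical.

```python
def triangulation_number(m, n):
    """T number of an icosahedral lattice, T(m,n) = m^2 + mn + n^2
    (formula from the figure legend)."""
    return m * m + m * n + n * n


def q_number(m, n, m_prime, n_prime):
    """Q number of a head elongated along a 5-fold axis.

    ASSUMPTION: Q = m*m' + m*n' + n*n'. This relation is not given in
    the legend; it is adopted because it matches the quoted examples.
    """
    return m * m_prime + m * n_prime + n * n_prime


m, n = 3, 1  # lattice coordinates used in the drawing
print("T =", triangulation_number(m, n))
for label, (mp, np_) in {"intermediate": (3, 2), "normal": (3, 3)}.items():
    print(label, "Q =", q_number(m, n, mp, np_))
```

With m' = m and n' = n the assumed expression for Q reduces to T, which corresponds to the isometric head (T = 13, Q = 13).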

Head length is determined at a very early stage, prior to 
the formation of the unprocessed proheads. It is regulated 
primarily by protein-protein interactions between the 
major head components described below. Mutations in 
the genes for the major head protein (gene 23), portal 
protein (gene 20), scaffold protein (gene 22), and vertex 
protein (gene 24) can modulate head length (92, 93, 152, 
353). Single amino acid changes in the gene 23 protein 
result in specifically altered head lengths. These mutations 
are located at 12 sites clustered in three locations in gene 
23. The total span of the three clusters is 155 bp, coding 
for approximately 52 amino acids (out of 521 amino acids 
in gp23). These sequences may represent segments of 
gp23* that interact with the other proteins that regulate 
head length, or with other gp23* subunits. Head length 
is also affected by variations in the relative intracellular 
concentrations of the head proteins (108), explaining virion 
size and density variations during the latent period (272). 

Several models for regulation of head length (50) have 
been proposed (32, 33, 70, 195, 267, 322, 368). Of these, two 
models are currently favored, but both require experimental 
support and the real mechanism may combine aspects of 


both models. The Vernier model (33, 322) proposes that 
length is measured internally, like a mechanical vernier 
used in measuring devices. A possible example is two linear 
structures made of protein subunits of differing size. During 
copolymerization, the ends of the structures would only be 
in register after a certain number of subunits have been 
added, and polymerization would stop only when they are 
in register. Such a model was proposed as an explanation 
for the regulation of T4 head length—based on differing 
structural repeats between the elongating scaffold or core 
protein and the elongating coat protein—and has been 
refined more recently (33). 
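
The vernier idea can be made concrete with a toy calculation. In the Python sketch below, the two repeat lengths (4 and 5 arbitrary units) are purely hypothetical and are not measurements of the T4 scaffold or shell; the code simply finds the first length at which two copolymerizing structures with different repeats come back into register, which is where the model supposes polymerization to stop.

```python
from math import gcd


def first_register(repeat_a, repeat_b):
    """Return (length, n_a, n_b): the shortest length at which two linear
    polymers with integer repeat lengths repeat_a and repeat_b end in
    register, and the number of subunits of each added by then."""
    length = repeat_a * repeat_b // gcd(repeat_a, repeat_b)  # least common multiple
    return length, length // repeat_a, length // repeat_b


# Hypothetical repeats: core subunit = 4 units, shell subunit = 5 units
length, n_core, n_shell = first_register(4, 5)
print(f"in register at length {length}: {n_core} core and {n_shell} shell subunits")
```

In this picture a small change in either repeat shifts the point of register, which is one way of thinking about how single amino acid changes could alter head length; the sketch is an illustration of the principle, not a model of the actual T4 proteins.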

The cumulated strain or intrinsic curvature model 
proposes that as protein subunits assemble they are 
progressively distorted as a result of their bonding interactions,
which are due to the intrinsic curvature of the proteins.
Ultimately, this distortion results in a switch of
the entire structure into a lower energy conformation,
where no further subunit addition is possible (195), or alternatively,
the structure is directly regulated by the subunit
curvature itself. Moody (267) has refined these arguments 
and defined two kinds of curvature, mean and Gaussian, 
that can account for the closed-shell assembly of spherical 
viruses. Clearly, in elongated T4 heads, additional size 
information must come from other sources, in this case the 
scaffold protein, since elongated heads have elongated scaffolds.
In addition, Doherty’s suppressor analyses (92, 93),
mentioned above, support the influence of additional factors 
in head length determination. Moody’s analysis is, however, 
an important theoretical contribution. 

The observation that single amino acid changes in one 
head gene can change the head length distribution has 
important consequences for the interpretation of head 
lengths in T4-related phages. Phages that are related to T4 
by sequence but which grow in different hosts and environments
may have isometric heads, like the marine
cyanobacteriophage S-PM2 (148), or have longer heads than
T4, like the vibriophage KVP40 (260). This implies that
evolution has selected for plasticity, not rigidity, in bacteriophage
capsid designs.

The T4 proheads expand during maturation and extensive
protein structural arrangements take place. These
result in large-scale movements of polypeptide chains from 
the surface to the inside and vice versa, as well as major 
shifts in inter-subunit connections (207). The expansion 
follows the initiation of DNA packaging (177, 178) and may
“lock” the capsid into a stable, “toughened” structure. The
early prohead lacks two accessory proteins of the complete 
T4 head, called Hoc and Soc, which are added after the 
head is expanded (50). Soc confers additional capsid stability, 
especially at low pH (175, 342). The Hoc protein lies at the 
center of a six-membered ring (capsomer) of gp23*, and
there are six copies of Soc around each capsomer, except 
around the gp24* vertices (figure 18-3). The Hoc protein 
forms a mushroom-like projection that protrudes from
the center of each capsomer (311) (figure 18-3). Osmotic
shock resistance is influenced by the gp24* vertex protein 
(235, 303). 

Studies of in vitro assembly of the T4 major head protein
have so far provided limited insight into length regulation, 
but mutant studies have revealed important aspects of 
assembly in vivo. T4 mutants defective in head-assembly 
genes 20, 22, and 40 accumulate large numbers of tubular 
assemblies of uncleaved gp23 called “polyheads.” Mutants 
in genes 20 and 40 are defective in prohead initiation, and 
infected cells accumulate large quantities of unassembled 
gp23, which assemble into tubular polyheads late after 
infection (50). In gene 22 mutant-infected cells that express 
mutant gp22 internal scaffolding protein, polyheads
are mostly of the same diameter, although the cylindrical 
pitch angles vary from tube to tube (267). However, in gene 
22 mutants that completely lack the scaffold (core) protein, 
multilayered cylinders form around those tubes that assemble
spontaneously, giving rise to tubes of different diameters.

In vitro, uncleaved monomeric gp23 protein (derived 
from dissociated polyheads) can be reversibly assembled 
into hexamers and polyheads, similar to assembly in other 
entropy-driven assembly systems such as flagella, microtubules,
and tobacco mosaic virus (50). The information
needed to form tubes is contained in the primary amino
acid sequence of the head protein under favorable environmental
conditions. Information for the assembly of
hexamers, and for determination of the shell diameter, is 
also contained within the gp23 protein sequence. 

Six trimeric “whiskers,” each 53 nm long, are attached 
to the lower of the two knobs protruding from the portal 
head vertex (figure 18-3). The whiskers are made of α-helical,
coiled-coil trimers of fibritin, the 486 amino acid product
of the wac gene (394). The extended C-terminal β-annulus
region initiates and stabilizes assembly of the trimeric
whisker, and the N-terminal end binds to the neck of the
head (52). These whiskers promote long tail fiber attachment,
extension, and retraction (78, 435, 436), and serve as
part of the environmental sensing system of the phage that
maintains the long fibers in a retracted and protected configuration
until needed. Figure 18-6 summarizes the assembly
pathway of the T4 phage head and shows the map positions
of the genes coding for capsid components. 

Tail Structure and Assembly 

Tails of Myoviridae, such as T4, have a remarkably constant 
structural design. Each tail has an intricate baseplate 
attached to a two-layered tube: an inner tubular part 
assembled from identical subunits of a roughly 20 kDa
protein, arranged in 4.0 nm stacked disks, surrounded by 
a tubular contractile sheath assembled from a roughly 
60 kDa protein with the same periodicity (53). In addition, 
tail fibers are usually attached to the baseplate. The baseplate
and tail fibers are composed of many different parts
that undergo coordinated conformational changes during
infection, resulting in cell-surface binding, sheath contraction,
membrane penetration, and DNA ejection.

T4 tail assembly is initiated by assembly of the complex 
baseplate (figure 18-3). T4 baseplates consist of a central 
component (“hub”), six outer “wedges,” and six tail 
“pins” (81). Hub and wedge components assemble via 
independent pathways that subsequently converge to form 
complete baseplates. 

The central hub of the baseplate remains poorly defined. 
Baseplate assembly initiates with formation of a multiprotein
hub complex, which subsequently undergoes significant
rearrangements before completion of assembly. The
composition of the initial hub complex is unclear, but it 
at least contains gp29, which later adopts an extended form 
within the tail tube, and a complex of gp5 and gp27, which 
is used to penetrate the cell envelope during infection (187). 
(In addition, gp51, gp26, and gp28 may also be involved 
as assembly catalysts or structural parts.) The gp5/gp27 
complex forms a membrane-puncturing needle that has 
lysozyme activity. When gp5 is incorporated into the baseplate,
it is proteolytically cleaved (at Ser351-Ala352), and
both cleavage parts remain in the baseplate. This cleavage 
is required to activate the lysozyme, which is used to break 
down the cellular peptidoglycan layer at the time of infection
(189, 190, 292, 304). Near the C-terminal end of gp5
is an 11 nm long, 2.8 nm wide three-stranded β-helix that
forms the needle, while the central lysozyme domain of gp5 
lies around the baseplate-proximal part of the needle (187). 
A trimer of gp27 forms a hollow cylinder into which the 
N-terminal antiparallel (3-barrel domain of gp5 is inserted, 
forming a thin, rigid device that facilitates penetration of 
the tail tube and viral DNA into the bacterial cell. 

The six outer “wedges” form the bulk of the baseplate.
Assembly initiates with the outer-edge proteins, and
progresses toward the center. The earliest step in wedge
assembly is the interaction of the tail pin proteins, gp10 and gp11,
with gp7 and gp8. Other proteins that complete the wedge
are added in the order gp6, gp53, and gp25 (81). Crowther
et al. (84) determined the architecture of the hexagonal and
star-shaped baseplates to about 3.0 nm resolution. From
mutational analyses they deduced that three of the proteins
studied (gp9, gp11, and gp12) account for 40% of the total
mass of the baseplate, and they can all be added to the
hexagonal structure after it is completed. Gp9 was determined to
be the site of tail fiber binding since gp9 is needed for tail
fiber attachment, is located near the site where tail fibers
join to the baseplate, and antiserum directed at gp9
determinants blocks fiber attachment. Gp11 makes up the distal
portion of the tail pin, and also supplies the gp12 binding
site. Gp11 has been found to be immunologically related
to gp10 (36) and may have evolved from it by gene
duplication. The structure of gp11 has been determined to atomic
resolution (236). Gp12 forms the six 35 nm short tail fibers,
which implement irreversible binding to the host during
infection (197). In wild-type phage and isolated tails, these
fibers occupy a stored or folded position (187, 249). Six
complete wedges assemble around a hub complex, and
baseplate assembly is completed by addition of gp48 and gp54,
providing the substrate for tail tube polymerization.

The inner tail tube is composed of 144 copies of a single
protein, gp19, which are arranged in 24 stacked rings spaced
4.1 nm apart (81). The tube has an average diameter of 9.0 nm
and an inner hole of 3.5 nm (268). The central hole in T4
tube baseplates (sheathless tails) is occupied by mass that
is absent in tube baseplates partially emptied by treatment
with guanidine HCl. This mass probably represents the
gp29 “tape measure” protein that determines tail length
(81, 101), similar to the gene H protein of phage λ. During
assembly several copies of this protein present in the
baseplate hub are thought to extend inside the growing tail
tube and act as a template to limit tail length during tube
assembly. It must also move out of the tube ahead of the
DNA during ejection, demonstrating a remarkable series of
activities for a single protein.
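
The dimensions quoted for the inner tail tube can be checked with a little arithmetic. The following Python sketch is only an illustration: the function names are hypothetical, and the stack length assumes that each of the 24 rings occupies one full 4.1 nm axial repeat, which the text does not state explicitly.

```python
# Values quoted in the text for the T4 inner tail tube (gp19).
TUBE_SUBUNITS = 144        # copies of gp19
TUBE_RINGS = 24            # stacked rings
RING_SPACING_NM = 4.1      # axial distance between adjacent rings
OUTER_DIAMETER_NM = 9.0    # average tube diameter
INNER_DIAMETER_NM = 3.5    # diameter of the central hole


def subunits_per_ring(subunits: int, rings: int) -> int:
    """Return the number of subunits per ring, assuming identical rings."""
    if rings <= 0 or subunits % rings != 0:
        raise ValueError("subunits must divide evenly among the rings")
    return subunits // rings


def stack_length_nm(rings: int, spacing_nm: float) -> float:
    """Approximate stack length as rings x spacing.

    Assumption (not from the text): each ring occupies one full repeat.
    """
    return rings * spacing_nm


def wall_thickness_nm(outer_nm: float, inner_nm: float) -> float:
    """Wall thickness of a hollow cylinder, from its two diameters."""
    return (outer_nm - inner_nm) / 2.0


if __name__ == "__main__":
    print(subunits_per_ring(TUBE_SUBUNITS, TUBE_RINGS))
    print(round(stack_length_nm(TUBE_RINGS, RING_SPACING_NM), 1))
    print(wall_thickness_nm(OUTER_DIAMETER_NM, INNER_DIAMETER_NM))
```

With the quoted figures this gives six gp19 subunits per ring, a stack of roughly 98 nm, and a tube wall about 2.75 nm thick.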

The contractile sheath, formed from 144 gp18 tail sheath
subunits, is arranged nearly identically to gp19 subunits
in the tail tube. All gp18 subunits are apparently in close
contact with gp19 subunits of the tail tube since assembly
of the sheath in an extended form is dependent on
polymerization of the tail tube. The ring of gp18 nearest the head
binds to the tail terminator, made of gp15 and gp3. Gp18
subunits in each ring are rotated by about 17° to the right
with respect to those below, giving rise to the prominent
right-handed helix seen on the surface (figure 18-3).
Amos and Klug (11) portray each gp18 in the extended
sheath as sloping downward from inner to outer radius,
and this is also visible in the reconstructions of frozen
hydrated tails (237). There is less downward slope in the
model of Smith and Aebi (374). The sheath contracts during
infection, due to a transition in which the axial repeat
decreases from 4.1 nm to 1.5 nm and the twist angle changes
from 17° to 32°. The dramatic change in the shape of
the sheath upon contraction was shown to be related
to relatively small changes in the overall conformation of
gp18 (11, 196).
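
The scale of the contraction follows directly from the quoted axial repeats and twist angles. The sketch below is a rough illustration under one stated assumption: that the 144 gp18 subunits form 24 rings, as in the tail tube. The text says only that the sheath is arranged "nearly identically" to the tube, so the ring count and the function names here are hypothetical.

```python
# Values quoted in the text for the T4 contractile sheath (gp18).
EXTENDED_REPEAT_NM = 4.1     # axial repeat before contraction
CONTRACTED_REPEAT_NM = 1.5   # axial repeat after contraction
EXTENDED_TWIST_DEG = 17.0    # rotation between successive rings, extended
CONTRACTED_TWIST_DEG = 32.0  # rotation between successive rings, contracted
ASSUMED_RINGS = 24           # assumption: same ring count as the tail tube


def sheath_length_nm(rings: int, repeat_nm: float) -> float:
    """Approximate sheath length as rings x axial repeat."""
    return rings * repeat_nm


def contraction_ratio(extended_nm: float, contracted_nm: float) -> float:
    """Contracted length as a fraction of the extended length."""
    return contracted_nm / extended_nm


def rings_per_turn(twist_deg: float) -> float:
    """Number of rings needed for one full 360 degree turn of the helix."""
    return 360.0 / twist_deg


if __name__ == "__main__":
    print(round(sheath_length_nm(ASSUMED_RINGS, EXTENDED_REPEAT_NM), 1))
    print(round(sheath_length_nm(ASSUMED_RINGS, CONTRACTED_REPEAT_NM), 1))
    print(round(contraction_ratio(EXTENDED_REPEAT_NM, CONTRACTED_REPEAT_NM), 2))
    print(round(rings_per_turn(EXTENDED_TWIST_DEG), 1))
    print(round(rings_per_turn(CONTRACTED_TWIST_DEG), 1))
```

Under this assumption the sheath shortens from roughly 98 nm to 36 nm, about 37% of its extended length, while the helix tightens from about 21 rings per turn to about 11.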

Long and Short Tail Fiber Structure and Assembly

The six long tail fibers of T4 are oriented for assembly onto
the baseplate by the fibritin (wac) “whiskers” (436). Some
other structure must serve this function in T2, since it lacks
whiskers. Fibers about 3.0-4.0 nm thick and 150 nm long
are made by the joining of two half-fibers at a kinked angle
of about 150°. The half-fiber bound to the baseplate is
constructed from three molecules of gp34 (71), and much of
the fiber is likely composed of a three-stranded, β-helical
structure. The half-fiber that binds to the cell surface is
more complex, and is made of gp35 and three copies each
of gp36 and gp37. The thin tip of gp37 contacts the cell
surface with the C-terminal end (21).

From combined X-ray diffraction and electron
microscopic studies, it has been proposed that the distal half-fiber
is composed of a set of globular domains at specific regions
along the half-fiber and that many of the polypeptide chains
are in a cross-β conformation with face-to-face packing of
both gp36 and gp37.

T4 fiber assembly requires the catalytic function of gp38 
(436). However, in phage T2 and several T-even-like phages, 
a non-homologous gp38 participates with gp37 in host 
range determination and receptor recognition (156). This 
protein functions as an adhesin, or cell-surface-binding 
protein, similar to those found in the pili of bacterial 
pathogens. 

Throughout the assembly process, host cell functions 
are required for proper construction of many parts of 
the virus. The GroEL protein chaperone and phage gp31 
co-chaperonin are needed for head assembly. Tail fiber 
assembly also depends on the interaction of viral gp57A 
with the fibers, but host mutants of E. coli have been isolated 
that do not require gene 57A function (436). 

The short tail fibers (gp12) at the bottom of the baseplate
have a 24 nm shaft, 3.8 nm in diameter at the N-terminus,
and a C-terminal globular domain (249). The short tail
fibers have two distinct conformations. As part of the
hexagonal baseplate they exist in a compact, “stored” position
with a bend near the center. When the hexagonal baseplate
expands to the star-shaped conformation, the fibers extend
to full length. The crystal structure of a heat- and
protease-stable part of the fiber (residues 85-396 and associated
residues 518-527) shows an N-terminal elongated domain,
a striking right-hand triple β-helix domain (290-327) and
a C-terminal, more globular domain (408, 409).

The short fibers attach to baseplate protein gp11, itself
a trimer, each monomer of which has 218 residues folded
into three domains (236). Both in the assembly pathway
and in the structure, trimeric gp11 binds to trimeric gp10.
The gp10 trimer binds to both trimeric gp9 and gp11. The
C-terminus of gp11 lies at the gp10-gp11 interface, while the
N-terminus binds the short fibers. Likewise, the C-terminus
of gp9 binds to gp11, and the N-terminus of trimeric gp9
binds the trimeric long half-fiber gp34. Thus, when the
long fiber binds reversibly to the cell surface, a signal
would be transmitted via gp9 through gp10 and gp11 to
the short fiber gp12, activating it for tight binding to the
cell (236).

Assembly of both the long and short tail fibers requires
participation of the viral-coded chaperone gp57A; in
its absence gp34, gp37, and gp12 fail to trimerize and
form intracellular, insoluble aggregates (60, 61). Unlike the
GroEL/ES or the GroEL/gp31 chaperonins, which are
relatively nonspecific regarding substrate proteins, gp57A
appears to act more specifically by blocking aggregation
of folding intermediates of gp12, gp34, and gp37, each
of which contains complex three-stranded β-helices that
require chaperone action.

Evolution 

As mentioned earlier, the “essential” genes of the most closely
related T-even phages are arranged in the same order, and
their sequences are similar, but the genomes of different
members of the family have different “nonessential” genes
interspersed between these essential genes. These
differences first became evident as insertion or substitution loops
in electron micrographs of heteroduplex DNA prepared
in vitro by annealing single strands of T2, T4, and T6 DNA
(201) (figure 18-2), and have been confirmed by sequence
comparisons in many cases (128, 226, 290, 396, 397).
Although inactivating the nonessential genes has no
consequences for phage growth in the laboratory, some are
suspected to be important under different physiological
conditions, and in the face of various restriction systems
imposed by different hosts. The more distantly related
T-even phages (2) have also rearranged the order of their
essential genes.

Based on similarity or divergence of tail and head genes,
the T4-related phages have been classified as T-even,
pseudo-T-even and schizo-T-even phages (397). However, such
classifications are arbitrary for any phages because horizontal
gene transfer has resulted in extensive mosaics of phage
genomes (233). For example, T4 and T6, which are closely
related in the tail genes, differ in their dCTPase genes as
much as the pseudo-T-even phages differ in their tail genes
(S. Lousteau and G. Mosig, unpublished data). Although it
has been postulated that the T4-related phages have
experienced few exchanges with other phage groups (226, 396,
398), there is now ample evidence to the contrary (153, 233;
chapter 21).

The view that “all the world’s a phage” (155) is supported
by patchy similarities of tail fiber genes of many phages
(130, 145, 154, 156, 259, 339). Moreover, the T4 endonuclease
VII (gp49) (18, 200), which cleaves recombinational
intermediates, resembles an endonuclease of Mycobacterium
phages L5 and D29 (155). The T4 DNA ligase (gp30)
resembles the T7 DNA ligase (15). The sigma factor for late
T4 transcription, gp55, resembles other sigma factors
(140, 142). Gp17, part of the ATP-dependent terminase and
DNA packaging complex, has regions of homology with
ATP binding packaging proteins of other phages and of
herpesviruses. The T4 lysozymes (gp e and gp5) resemble
lysozymes of other phages (127, 292, 423). This list will
undoubtedly become larger as more complete genomes
are sequenced.

Exchanges between different phage genomes were 
first experimentally analyzed among the lambdoid phages, 
leading to the conclusion that entire modules of related 
genes had to be exchanged together (64). In retrospect, the
strict concept of modules may be imposed on phages with
site-specific recombination systems and few control regions,
such as promoters. In contrast, the T4-related phages have
a much higher potential for homologous recombination,
and they have many independent promoters and
overlapping transcription units. This allows lateral transfer via
ectopic homologous recombination of much smaller units,
that is, genes and gene segments. In fact, horizontal
transfer of different genes to a site adjacent to the dCTPase gene
in different T-even phages is apparently responsible for the
large differences found between the dCTPase genes of
different T-even phages (285, 290). We have proposed that
this mechanism involves pairing of partially homologous
sequences, join-cut-copy recombination and heteroduplex
repair (figure 18-8), and that it generates multiple mutations
from a single horizontal transfer. This mechanism is
probably especially prevalent among the T4-related phages
because of their high recombination potential. It can explain 
the differential variability of other phage genes as well, and 
throws doubt on interpretations of cladistic analyses. 

For example, substitutions of foreign sequence into the
tail fiber genes can account for the differences in tail fibers
and host range among various members of the family
and for the remarkably rapid coevolution of viral and
host genomes (156). The T4 terminase subunit gp17 has
at its N-terminal segment a ssDNA binding domain that is
lacking in terminases of other phages. However, shorter
proteins can be synthesized from transcripts initiated at
an internal promoter (118, 119). We have discussed the
evolutionary relevance under “DNA Packaging.” The
lysozyme domain of T4 gp5 (discussed under “Tail Structure
and Assembly”) is missing in gp5 of the related Vibrio phage
KVP40 (E. Miller, personal communication). Whereas such
differences could also be explained by deletions of gene
segments, the DNA sequence divergence in adjacent regions
suggests that these domains have been acquired by lateral
transfer via a mechanism depicted in figure 18-8. In turn,
the heterologies contribute to the species barriers between
different members of the family by reducing production of
viable recombinants.

Remarkably, T4, but so far no other member of this
family, encodes several related endonucleases, either
embedded into introns and responsible for homing of introns
(24, 26, 27), or free-standing: segA-F (28, 363) and
mobA-mobF (262, 400). These endonucleases can contribute to
T4’s ability to exclude alleles of T4-like phages, by making
specific cuts in T4-like phages but not in T4 (28, 290).
However, other nucleases must contribute to the mutual
exclusion of T4-like phages since exclusion also occurs
between phages that do not contain any of the
endonucleases mentioned above (G. Mosig, unpublished data).

Figure 18-8 Moron acquisition. A model to explain acquisition of a new gene or gene segment (moron) by an adaptation
of join-cut-copy recombination (pathway III in figure 18-4). Very limited sequence homology between the foreign DNA
and a resident gene A allows single-strand invasion by the fragment’s end. Branch migration into a heterologous region
generates heteroduplexes with multiple loops and mismatches, which can be repaired by endo VII (198, 295, 296).
Initiation of replication from an endonuclease cut in the invaded strand will copy the moron into the ancestral T-even
genome, if a short homology at the other end of the foreign DNA allows a similar invasive recombination. From (285).
See thebacteriophages.org/frames_0180.htm for a color version of this figure.

DNA of T4 and its relatives has a much higher A-T content
than E. coli DNA, and there are different opinions as to how
this difference is maintained. A comparison of the
homologous T4 and E. coli Dam methylase genes (151) suggests
that silent codon changes corresponding to T4 tRNA
anticodons are rapidly selected. For example, CUG is the most
frequently used E. coli leucine codon, and 13 of the 26
leucine codons in the E. coli gene are CUG. In contrast, none
of the 29 T4 leucine codons is CUG, in keeping with the fact
that E. coli leucyl tRNA is rapidly cleaved after T4 infection.
In most other genes the AT-rich codons corresponding to T4
tRNAs predominate. The notable exception is the highly
expressed gene 23. This led Kunisawa et al. (224) to reject the
above theory.
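
A comparison of this kind rests on counting synonymous codons in each coding sequence. The following Python sketch shows one way such a count could be made. The function names and the toy sequence are hypothetical and are not taken from the cited work; only the six leucine codons of the standard genetic code are assumed.

```python
from collections import Counter

# The six leucine codons of the standard genetic code (RNA alphabet).
LEUCINE_CODONS = {"CUU", "CUC", "CUA", "CUG", "UUA", "UUG"}


def leucine_codon_usage(coding_sequence: str) -> Counter:
    """Count each leucine codon in an in-frame coding sequence.

    Accepts DNA or RNA letters; T is converted to U before counting.
    """
    seq = coding_sequence.upper().replace("T", "U")
    if len(seq) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    codons = (seq[i:i + 3] for i in range(0, len(seq), 3))
    return Counter(c for c in codons if c in LEUCINE_CODONS)


def codon_fraction(counts: Counter, codon: str) -> float:
    """Fraction of all counted leucine codons that are the given codon."""
    total = sum(counts.values())
    return counts[codon] / total if total else 0.0


if __name__ == "__main__":
    # Toy sequence for illustration only: ATG CTG TTA CTG CTT TAA
    usage = leucine_codon_usage("ATGCTGTTACTGCTTTAA")
    print(dict(usage))
    print(codon_fraction(usage, "CUG"))
```

Applied to the figures quoted above, the CUG fraction would be 13/26 = 0.5 for the E. coli gene and 0/29 = 0 for the T4 gene.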

In any case, the differences in A-T content make
alignments of T4 DNA sequences with homologous sequences
of other organisms tenuous, but alignments of amino acid
sequences can reveal evolutionary relationships. Both
aerobic and anaerobic ribonucleotide reductases of T4
closely resemble the E. coli ribonucleotide reductase (372).
In some cases, structural analyses have revealed similarities
not detectable at the sequence levels. The first examples
were T4 and goose lysozymes (254).

It now appears that the DNA replication proteins of
T4 are more closely related to the replication proteins of
eukaryotic viruses and their hosts than to those of E. coli
(116, 411). Similarly, some recombination genes of T4 share
sequence similarities with corresponding proteins of all life
forms: the UvsX protein with RecA-like proteins (106), the
gp46-47 complex with the SbcBC and Mre11-Rad50
complex (79, 80) and with the RecBC enzyme of E. coli (39).
Based on such similarities it has been postulated that
transfer of these genes from DNA viruses has shaped the
evolution of eukaryotes.

Outlook 

As suggested throughout this chapter, many complexities
and redundancies of T4 are perhaps best understood
in terms of its evolutionary history. Exchange of genes
and acquisition of genes from other genomes (by lateral
transfer) has been the raw material for natural selection
under different environmental conditions. It is already well
established that many genes are shared by rather different
phages growing in different hosts and environments (155).
Moreover, phages with contractile tail sheaths that resemble
T4 have been found in many environments (148) (Robert
V. Miller, personal communication). They contain many
essential genes that are conserved in sequence, and often
in relative locations, interspersed with different nonessential
or auxiliary genes of unknown function. Sequencing of their
complete genomes is bound to contribute to
understanding the evolution of phages specifically, and of evolution in
general. It should also reveal to what extent site-specific
recombination, transposition, and ectopic homologous
recombination contributed to lateral gene transfer in
different phage genomes.

Two major lessons from T4, mentioned throughout
this chapter, are: (i) The concept of protein machines, first
developed to describe the exquisite structures formed
during virion assembly, and their conformational changes
during assembly and infection, has been extended to the
smaller, more fragile macromolecular assemblies that drive,
among others, DNA replication, DNA precursor
biosynthesis, recombination, repair, packaging and transcription.
It remains to be seen to what extent the rules governing
these protein nanomachines are universal for assemblies
of defined stoichiometry as compared with assemblies that
change their functions by acquiring different parts. (ii) The
processes that are driven by these protein machines are
tightly interrelated, by physiological connections as well
as by sharing of certain proteins. Thus, neither functioning
nor evolution of these mechanisms and processes can be
understood in isolation.

DNA sequences and deduced protein sequences
can now be used to evaluate possible general principles.
Three aspects learned from such studies are as follows:
(i) Although T4 proteins share certain similarities with
proteins of similar function in other systems, in most cases
the similarities are patchy, and they are not apparent at
the DNA level. (ii) Some T4 proteins, seemingly unrelated,
share patches of homology, even at the DNA level. (iii)
Many T4 genes produce more than one in-frame protein,
probably because multiple forms are important in the
proper assembly and function of the corresponding protein
assemblies.


We have discussed elsewhere (277, 279) how the original
strategy of T4 DNA replication together with the strategy of
the transcriptional program provides a most powerful
selection for efficient recombination. Recombination between
partially homologous sequences, in turn, can initiate lateral
gene transfer, associated with multiple sequence changes
(279, 285, 290). It can also generate gene fusions and even
proteins with novel functions. The homologies of T-even
and eukaryotic replication proteins allowed general
conclusions from exquisite biochemical, biophysical, and genetic
analyses of T-even proteins. Short similarity of some
T4 sequences with sequences of its host, other phages,
plasmids, and eukaryotic genomes probably provides the
substrate for “mix and match” evolution. The sharing of
amino acid sequences among replication, transcription, and
packaging components, and recombination between the
corresponding DNA, might facilitate the rapid coevolution
of these components and their integration into the complex
T4 system.

Because the essential genes of T4 have been exquisitely
characterized by many investigators, in terms of both
regulation of their expression and functions of their products,
the analysis of its annotated genome (262) will serve as a
guide to interpretations of related genomes, which are
now rapidly being sequenced (87) (J. Karam, personal
communication; see http://phage.bioc.tulane.edu). The lessons
learned from T4 alert us to the dangers of strictly
computational methods, but encourage creative experimentation.
In turn, these comparisons will undoubtedly refine our
understanding of virus evolution specifically as well as of
general evolutionary principles.

One of the more general aspects of T4 biology is 
the remarkable redundancy of pathways and proteins 
for fundamental processes, such as DNA replication and 
recombination. The importance of redundant pathways 
for global evolution of novel developmental circuits has 
been discussed (203). To maintain redundant pathways 
during evolution, it is important that they are based on 
different principles and subject to different pressures. 
The persistence of mechanistically different redundant 
pathways in T4 can be readily rationalized because they 
are best adapted to different stages of transcription, gene 
expression in general, and DNA packaging during the 
course of phage development (279, 287). Therefore, these 
redundancies are ready to bypass certain lesions in other 
important genes, and they confer tolerance to changes that 
facilitate evolution. 
Bacteriophage T5

JON R. SAYERS 


Several features of phage type 5 make this a very
interesting and rather unusual lytic bacteriophage. Phage T5
does not encode an RNA polymerase, unlike the T7 phage,
nor does it contain unusual DNA modifications such as the
5-hydroxymethyl deoxycytidine present in phage T4. T5
DNA carries some of the strongest known prokaryotic
promoters but they are regulated in a strictly controlled
temporal sequence. Perhaps the most unusual feature is
that the phage’s DNA enters the host in a two-step transfer
mechanism. Transfer of the viral genome from the virion to
host pauses after injection of the first 10 kbp of the 120 kbp
linear genome, until the pre-early phage-encoded proteins
have been produced. The DNA itself is unusual in that it
contains a number of cryptic nicks in one strand of its
double-stranded genome. The phage is able to destroy host
cell DNA rapidly, yet can quickly inactivate the nuclease
responsible for this degradation.

The T5 group includes phages BF23, 29a, BG3, and PB
(52). Of these only T5 and BF23 have been studied in any
detail. However, in comparison with phages λ, T4, and T7,
relatively little work has been published on these
fascinating phages. This chapter will present a general overview of
what is known about their life cycles and will concentrate
particularly on advances made since the last major review
of the group was published (56).


Physical Properties and Attachment to the Host

Bacteriophage T5 belongs to Bradley’s morphological group
B (12). A long, noncontractile tail of 200 nm by 12 nm (11)
is attached to one of the vertices of the icosahedral head (2).
Bacteriophage T5 DNA consists of a linear double-stranded
DNA genome approximately 121,300 bp long. The phage
genome possesses terminal redundancies in the form of
10,160 bp direct repeats (65, 66). These repeats carry the
so-called pre-early genes (figure 19-1). An unusual feature
of T5 DNA is that one strand is nicked in several places (1).
The role and origins of these nicks remain cryptic despite
early reports of four nucleases able to introduce such
interruptions into the DNA backbone (69). However, the termini
of several nicks have been analyzed and found to contain the
consensus sequence -NPurine/5'pGCGCN- (63).

Irreversible adsorption of the virion to the host follows
once the product of the T5 oad gene (or hrs in BF23) binds to
the cellular receptor (34). The protein encoded by oad, pb5,
binds to the FhuA protein (32, 59). The orthologous gene in
BF23, hrs, encodes a related but divergent receptor-binding
protein (44, 58). Sequence analysis of the T5 and BF23
receptor-binding proteins reveals that they share some conserved
motifs (58). To date this is the only major variation observed
between T5 and BF23 sequence data. The BF23 pb5 protein
binds to BtuB, the E. coli vitamin B12 outer membrane
transport protein (10). The T5 receptor, FhuA, is a
multifunctional membrane protein involved in the uptake of
ferrichrome-iron (34).

Elegant work by Killman and coworkers has identified
a short peptide sequence able to trigger complete release of
phage T5 DNA from the virion (43). This peptide sequence,
APADKGHY, maps to a loop on the extracellular side of
the β-barrel membrane-protein FhuA, the T5 receptor.
The structure of this monomeric membrane protein has
been determined by Ferguson et al. (21) and is shown
in figure 19-2. The protein has two domains: a β-barrel
composed of 22 antiparallel β-strands and a structurally
distinct “cork” domain. The cork, comprised of four-stranded
β-sheets and four short α-helices, is inside the β-barrel
and thus blocks the channel. Binding of T5 virions to FhuA
triggers a conformational change thereby opening a channel
large enough to allow an efflux of Fe(III)-ferrichrome and
the passage of T5 DNA (51).


First-Step Transfer of DNA 

Figure 19-1 Diagram relating the phage T5 physical and genetic map. The double-stranded DNA duplex (black lines) consists
of one continuous (upper) and one nicked strand. Pre-early genes are encoded on the terminal repeats (dark shaded). The
early and late regions are shown in white and grey, respectively. Genes coded on the top and bottom strands are shown
above and below the bars respectively. Black arrows and a diamond show the positions of viable deletions.

Figure 19-2 The T5 receptor protein. A: Structure of the T5 receptor, FhuA, as determined by Ferguson viewed in the plane
of the membrane (21). The β-barrel traverses the outer membrane. Note the black loop that interacts with the T5 receptor
binding protein pb5 (43). There is also a secondary interaction with a region within the β-barrel. B: A “phage’s eye view” of
the receptor and the cork domain which blocks the channel until the phage binds.

Once irreversibly bound to FhuA, the phage DNA is
transferred into the host cell in a two-step process. The
structure of the FhuA protein has been determined (64). It
consists of a β-barrel formed by the C-terminal 556 amino
acid residues. These residues form a channel or pore-like
structure blocked with a “plug” composed of amino acid
residues 20-157. Presumably, once irreversibly bound to pb5,
the FhuA or BtuB transmembrane proteins undergo some
form of conformational change to allow injection of the
nucleic acid. It is the left end of the genome which is
transferred into the host cell first (35, 36, 45, 74). This process has
been studied in vitro using FhuA protein incorporated into
unilamellar vesicles made of natural phospholipids (46).
Under these conditions the entire T5 genome is rapidly
transferred into the vesicle, where the DNA remains in a
densely packed form. This is in stark contrast to the situation
in vivo where DNA is transferred in a two-step process. Upon
irreversible binding to the cell-surface FhuA or BtuB
protein, the process of first-step transfer (FST) occurs
during the first 2 minutes, in which approximately the first
8% of the left-hand end of the linear duplex genome is
injected into the host cell (48). There then follows a pause of
about 5 minutes during which time T5-specific messenger
RNA (mRNA) is synthesized and translated into the
pre-early proteins. These proteins, A1 and A2-3, must be
produced in order for the remaining T5 DNA to enter the
cell in a process known as second-step transfer (SST) (47).
It has been shown that this process requires the presence
of calcium ions at approximately 1 mM. Calcium is required
to maintain polarization of the bacterial cytoplasmic
membrane, supporting the notion that the pores opened
during FST must be closed in order for the infection to
proceed (7). If calcium is not present, few or no proteins are
expressed from the FST DNA and the infection aborts
unless calcium is added back to the medium (8).
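
The size of the first-step-transfer segment can be estimated from figures given in this chapter. The sketch below is a back-of-the-envelope illustration only: the function names are hypothetical, and adding the 2-minute FST period to the roughly 5-minute pause to obtain an earliest time for second-step transfer is a simplifying assumption, not a measured value.

```python
# Figures quoted in this chapter for phage T5.
GENOME_BP = 121_300          # approximate genome length
TERMINAL_REPEAT_BP = 10_160  # length of each terminal direct repeat
FST_FRACTION = 0.08          # approximate fraction injected during FST
FST_DURATION_MIN = 2.0       # FST occurs during the first 2 minutes
PAUSE_MIN = 5.0              # approximate pause before SST


def fst_length_bp(genome_bp: int, fraction: float) -> int:
    """Approximate number of base pairs injected during FST."""
    return round(genome_bp * fraction)


def repeat_fraction(repeat_bp: int, genome_bp: int) -> float:
    """Terminal repeat length as a fraction of the whole genome."""
    return repeat_bp / genome_bp


def earliest_sst_start_min(fst_min: float, pause_min: float) -> float:
    """Earliest SST start, assuming the pause follows FST directly."""
    return fst_min + pause_min


if __name__ == "__main__":
    print(fst_length_bp(GENOME_BP, FST_FRACTION))
    print(round(100 * repeat_fraction(TERMINAL_REPEAT_BP, GENOME_BP), 1))
    print(earliest_sst_start_min(FST_DURATION_MIN, PAUSE_MIN))
```

On these numbers the FST segment is about 9,700 bp, close to the 10,160 bp terminal repeat (about 8.4% of the genome) that carries the pre-early genes, and SST would begin no earlier than about 7 minutes after binding.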

Pre-Early Proteins 

Bacteriophage T5 contains some of the strongest promoters
yet characterized, with high affinity for the host cell RNA
polymerase (27, 84). The FST DNA contains at least three
promoters which drive the production of pre-early mRNA
immediately upon entry (83). Biochemical studies suggest
that up to 10 pre-early proteins are present in phage-infected
cells but not all are essential, since a viable mutant of BF23,
which carries a deletion (dell), fails to synthesize three
or four pre-early proteins (55). The best characterized
pre-early genes are A1, A2-3, and dmp. Expression of the
pre-early genes has dire consequences for the host cell: synthesis
of E. coli DNA, RNA, and proteins is rapidly shut off;
most of the host cell DNA is degraded (49, 88); host enzymes
such as EcoRI, recBC, DNA methylase, and uracil-DNA
glycosylase are all inactivated (56). The identity of the nuclease
responsible for the rapid destruction of host cell DNA is not
yet clear, but A1 mutants fail to degrade DNA, leading to
speculation that A1 is the nuclease responsible.

The A1 protein has been localized to both inner and outer
membranes. It has a molecular weight of approximately
57 kDa, and partial sequencing reveals that the N-terminal
region possesses a signal peptide leader sequence which is
consistent with membrane association (86). A1 appears to
form multimers with itself and hetero-oligomers with the
smaller A2-3 protein (4). The A1 homo-oligomer is
approximately 244 kDa, while the hetero-oligomeric complex with
the A2 polypeptide is larger, at around 364 kDa (4).
Sequencing of part of the BF23 pre-early region showed genes A2
and A3 to be one gene, renamed A2-3, which encodes
a protein of 13.8 kDa (70, 86). The BF23 and T5 A2-3
proteins share over 95% identity and show some similarity to
a number of nucleotide binding proteins and a lipoprotein
(23, 79). The A2-3 protein is implicated in altering the host
cell membrane structure (78) and the purified protein is able
to bind DNA in vitro (80).

Another unusual feature of T5 infection is the excretion
of bases and nucleotides from the cell during the first few
minutes of infection. Host cell DNA degradation products
(deoxyribonucleoside 5' monophosphates, dNMPs) are
converted to deoxyribonucleosides by the pre-early product of
dmp, a 5' deoxyribonucleotidase; the deoxyribonucleosides
are further broken down to bases and excreted by cellular
enzymes (85). Presumably this excretion is required to avoid
the potentially damaging build-up of dNMPs that would
otherwise occur (61). The nucleotidase activity decays rapidly
after SST has taken place, implying that an early gene is
responsible for shutting down this enzyme (5).

Pausing and Second-Step Transfer 

What causes the transfer of DNA to pause at the FST stage?
It had been proposed that a complex secondary structure
or physical discontinuity on the DNA might provide the
injection stop signal (ISS) (75). Parts of the FST region,
including the ISS, have been sequenced. This region does
indeed contain sequences with a high potential for
forming complex secondary structures, such as multiple direct
repeats of up to 18 bp, inverted repeats and an 18 bp
palindrome. These repeats include three 31 bp repeat units
contained on a 99 bp sequence that could form mutually
exclusive stem-and-loop structures. Another pair of 21 bp
repeats contain two sequences resembling DnaA
protein-binding sites (37). However, given that exposure of T5
bacteriophage to purified FhuA alone is enough to cause
complete ejection of the entire genome from the head, it
is clear that some other interactions must be involved to
halt FST at the ISS.

The presence of elaborate secondary structures and
potential DnaA-binding sites at the ISS suggests the
involvement of host DNA-binding proteins (37). In addition, one of
the cryptic nicks occurs at the base of a 9 bp palindrome in
this region. It seems likely that host cell molecules, most
probably DNA-binding proteins, bind the ISS, halting
transfer at FST. Given that the A1 protein can form oligomers with
A2-3, which in turn has DNA-binding activity, it seems
likely that these large protein complexes bind to the
elaborate structures potentially formed at the ISS. Perhaps these
phage-encoded proteins initiate SST by displacing
host proteins bound at the ISS. Certainly mutants defective
in A2-3 are unable to complete SST, but do degrade host
cell DNA, and SST can only occur after synthesis of the A1
and A2-3 polypeptides (47).

Early Genes and Their Roles

Many of the devastating effectors of pre-early gene
expression become inactivated before or shortly after SST takes
place. The potent nuclease activity responsible for
destruction of the host's DNA is neutralized and the strong
pre-early promoters are silenced, probably due to interactions of
the host cell RNA polymerase with the A1 protein and
another pre-early gene product (53). The early region also
contains an as yet unidentified factor required to shut off
the phage-specified nucleotidase activity (5). This shutoff is
necessary for efficient replication to occur. Another early
gene, lip, encodes a lipoprotein (18) whose function is to
block superinfection of the host. This is accomplished in a
process designated as lytic conversion (64) in which the
lipoprotein interacts directly with the FhuA receptor. This
process requires a large excess of the lipoprotein, at least
in vitro, where the ratio of this protein to receptor needs
to be in excess of 10:1 in order to fully block the interaction
of T5 virions with FhuA (64).

The SST DNA contains a large deletable section in the
early gene region (del 2, 21.1-32.3% on the map) encoding
24 tRNAs (76). These genes are presumably required to
facilitate efficient translation of phage mRNA required for viral
replication. Host tRNA, produced prior to the extensive
degradation of host DNA shortly after FST, can suffice to
support modest replication rates upon infection with
T5st(0) (a phage in which much of the tRNA encoding
region has been deleted), but replication times become
extended. A structural comparison of phage-encoded
tRNAs suggested that they may possess some unusual
features. For example, the tRNALeu and tRNATrp have a
longer anticodon loop than the corresponding bacterial
tRNAs and could act as frameshift mutation suppressors
(76). This region also contains several open reading frames
of unknown function.
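
To give a feel for the size of the del 2 region, the map percentages can be converted to approximate base-pair coordinates. The sketch below is only illustrative: it assumes that map positions are simple percentages of the 121,750 bp sequenced genome length quoted in the concluding remarks of this chapter, which may not match exactly the length on which the original genetic map was based. The function names are hypothetical.

```python
# Length of the sequenced T5 genome quoted in this chapter (bp).
# Assumption: map percentages are taken relative to this length.
T5_GENOME_BP = 121_750


def map_percent_to_bp(percent, genome_bp=T5_GENOME_BP):
    """Convert a map position (percent of genome length) to a bp coordinate."""
    if not 0.0 <= percent <= 100.0:
        raise ValueError("map position must lie between 0 and 100 percent")
    return round(genome_bp * percent / 100.0)


def interval_bp(start_percent, end_percent, genome_bp=T5_GENOME_BP):
    """Return (start, end, length) in bp for an interval given in map percent."""
    start = map_percent_to_bp(start_percent, genome_bp)
    end = map_percent_to_bp(end_percent, genome_bp)
    return start, end, end - start


# The del 2 region is described as 21.1-32.3% on the map.
del2_start, del2_end, del2_length = interval_bp(21.1, 32.3)
print(f"del 2: ~{del2_start}-{del2_end} bp, ~{del2_length / 1000:.1f} kb")
```

Under these assumptions the deletable region spans roughly 13.6 kb, about 11% of the genome.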

The SST DNA contains late, early, and pre-early
promoters, the latter being on the second terminal repeat. However,
expression of genes on the SST DNA is limited to the early
genes for 8 minutes after SST. Only after certain early gene
products have accumulated can the late genes become
active, which occurs at about 12 minutes after infection.
Early genes C2, D5, and possibly D15 are thought to be
involved in regulation of T5 gene expression. The role of the
D15 nuclease will be discussed later. The 90 kDa protein
encoded by C2 appears to be weakly associated with host
RNAP and could act as an alternative σ factor, thereby
shifting specificity from early to late promoters (16, 81, 82). The
D5 gene product appears to have a more complex role,
being involved in both positive and negative regulation as
well as in DNA replication (16, 54). Biochemical studies on
the purified D5 protein suggest that it is a highly soluble
29 kDa protein, able to bind both double- and
single-stranded DNA in vitro (68) and is associated with the T5
transcription-replication complex (22).


A large region of the SST DNA is devoted to encoding
genes involved in DNA replication. Cells infected with T5
produce more DNA packaged as phage than was originally
present in the cell prior to infection. This large synthetic
burden is carried out with the aid of many phage-specified
proteins. These include DNA polymerase (the D9 gene
product, formerly D7-8-9), 5'-3' exonuclease (D15), helicase
(D10), DNA ligase, dihydrofolate reductase (dhr),
ribonucleotide reductase (possibly B1 or B2), deoxynucleoside
monophosphokinase (dnk, C0.3) (57), thioredoxin and
thymidylate synthetase (reviewed in 52). The dnk gene
product has recently been purified from phage-infected
cells and its biological activity and N-terminal sequence
determined (19).

The D15 exonuclease gene was sequenced by Kaliman
et al. (40) and shown to encode a protein of 291 amino
acids. A consensus E. coli promoter is located upstream of
the D15 Shine-Dalgarno region. Overexpression of the
enzyme has allowed the biochemistry and biophysics of this
protein to be investigated (19, 72). The D15 nuclease
functions as an exonuclease in the 5'-3' manner, releasing
mono- or short polynucleotides depending on the substrate.
In addition to this activity, the enzyme is an efficient flap
endonuclease, able to cleave bifurcated DNA possessing a
free 5' arm (13). The D15 nuclease also possesses nanomolar
binding affinities for DNA substrates with single-stranded
arms (26). The crystal structure of this phage enzyme
revealed the presence of a helical arch or clamp, thought to
be involved in DNA binding or threading of substrates
(13). Sequence analysis shows that this enzyme is highly
homologous with the small fragment of DNA polymerase I,
that is, amino acids 1-323 carrying the 5'-3' exonuclease
activity of the polymerase holoenzyme. So far this is the
only T5 protein whose crystal structure has been
determined (figure 19-3).

It has been suggested that D15 may play a role in
processing T5 DNA containing pre-existing nicks by cleaving
the intact strand opposite the nick (60). Biochemical studies
on the overexpressed enzyme do suggest that, in addition
to flap endonuclease activity, the enzyme is able to cleave
in a purely endonucleolytic manner in a reaction
requiring a nicked substrate (73). The same study provided some
support for early reports suggesting that the D15 protein
may be involved in switching on late genes (15). Although
little further work has been reported in this area, the
D15 nuclease is able to introduce gaps into nicked
substrates in vitro. It has been suggested that single-stranded
regions in replicating T5 DNA are needed for late gene
expression (15).

The D9 DNA polymerase has been isolated in the T5
transcription-replication complex together with the D5
and D15 polypeptides (22). The D15 nuclease is clearly
involved in DNA replication, as had been long suspected
(24, 25). Bacteriophage T4 contains a close homolog of T5
D15 known as T4 RNaseH (38). The structures of these
enzymes are very similar indeed (3). This has been shown to
be important for processing of Okazaki fragments from
replicating DNA lagging strands (6). These RNA oligomers are
removed largely through the action of the 5'-3'
exonucleolytic function of the 5' nuclease. Together the D9 polymerase
and D15 5' nuclease proteins make up the equivalent of
the E. coli DNA polymerase I holoenzyme, possessing 5'-3'
exonuclease (D15 nuclease), polymerase, and proofreading
activities (D9 polymerase). The D9 gene has been sequenced
and shown to encode a polymerase with associated 3'-5'
proofreading exonuclease activity as expected (50). This
enzyme is highly processive and can carry out strand
displacement synthesis, but extensive characterization of this
enzyme has been limited due to lack of solubility of the
over-expressed protein (14).

Figure 19-3 The phage T5 D15 nuclease. A: Crystal structure of the D15 exonuclease or flap endonuclease. A conceptual
model of a bound DNA flap substrate is shown (13, 19). B: Endonucleolytic (open arrow) and exonucleolytic modes
(black arrow) of DNA cleavage and gapping reactions at a nick.

Figure 19-4 Diagrammatic representation of a 15,000 bp region covering the junction between early (D genes) and late
genes. This sequence was assembled from three sequences, accession numbers M64047, M24354, and AY543070. The
position of a strong terminator of transcription (ter) is indicated. The D9 DNA polymerase is preceded by a promoter (not
shown). D10 is a putative helicase; D11 and D14 have no obvious functions or homologs of known function. Genes D12 and
D13 are thought to be part of a repair nuclease. The D15 exonuclease gene overlaps with the deoxyuridine triphosphatase
(dut) gene, both of which are served by a promoter just upstream. The long tail fibre gene (ltf) is transcribed in the opposite
orientation to the early genes. In addition to the reading frames shown, the fragment contains numerous ORFs with little
or no homology to any database deposits. An enlargement of a region defined by EcoRI and BglII restriction sites is shown.
Tandem promoters P31A and P31B are separated by 100 bases (39). Their -10 sequences are both overlapped by a 20 bp
direct repeat (triangles). P31B possesses an 18 bp imperfect palindrome (diamond) that overlaps with the P31B -35 element.
A cryptic 94 amino acid ORF is situated downstream and must be present in order to clone fragments carrying P31B (39).

A large contiguous sequence encompassing the region
from the D9 polymerase to the end of one of the late genes
(including the ltf gene encoding the long tail fiber protein)
can be assembled from sequences M64047, M24354, and
AY543070 (41, 42, 50), which allows an unambiguous
ordering of several genes in this region (figure 19-4). The sequence
also reveals the presence of tandem promoters P31A and
P31B, which appear to control expression of a putative
D12-D13 exonuclease, possibly involved in DNA repair (42).
Interestingly, we were able to clone a fragment carrying these
tandem promoters provided that an adjacent open reading
frame (ORF) was present on the cloned fragment of T5 DNA
(see figure 19-4). This was possible despite the observations
that the closely related D9 promoter could not be cloned on
a multicopy plasmid due to the high strength of that
promoter (14). Disruption of the ORF immediately downstream
of P31B led to recovery of clones lacking the strong promoter
P31B. Loss of this promoter was due to recombination
between 20 bp direct repeats flanking P31B, leaving only
the weaker P31A promoter (J. Sorrell and J. R. Sayers,
unpublished data). No obvious transcription terminator was
present on this fragment. Kaliman et al. (39) reported similar
observations when working with the same fragment. Stable
cloning of strong T5 promoters often requires the presence
of a downstream terminator as described by Bujard and
coworkers (28). We suggest that the ORF downstream of
P31B might encode a repressor capable of inactivating this
strong promoter, thereby providing a stabilizing influence
on the recombinant plasmid (J. Sorrell and J. R. Sayers,
unpublished data).

Replication 

The replication of T5 DNA has not been as well studied as the
replication of T4 and other phages. Since the last major
review of the T5 group (56), little progress has been made in
determining the structure of replication intermediates
present in T5-infected cells. Replication was shown to
proceed bidirectionally from multiple, internal origins. A
primary origin of replication is located near the center of
the genome. Significant numbers of circular T5 DNA
molecules are observed during the later stages of infection. These
replicative circular molecules appear to be in either a theta
or sigma configuration (9). Replication forks, loops, and
circular structures, similar to those found in other large
phages, were also observed by Everett (20). Phage capsid
structures were associated with both mature phage-length
DNA and concatemeric molecules (20). It also appears that
host DNA gyrase is required for T5 DNA replication and
for late gene expression (17).

Late Genes 

The late genes encoding structural proteins reside to the
right of the D15 gene and their order has been determined
by Heller (32). The virion is composed of 15 or more proteins
(32, 87). The most abundant are the major head protein
encoded by D20-21 (32 kDa, 730 copies), the major tail
protein encoded by N4 (58 kDa, 120 copies), and another
head protein, the N5 gene product (19 kDa, 114
copies). In BF23 a minor tail protein (gene 24) with a mass
of 34 kDa has been identified along with gene 25, which
encodes a major tail protein of 50 kDa, possibly
corresponding to gene N4 in T5 (62). The gene encoding the L-shaped
tail fiber (ltf) has been sequenced and shown to direct
the synthesis of a protein of approximately 148 kDa, in
reasonable agreement with the previously reported mass of
125 kDa (33, 42). Each virion contains six to nine copies
of this bent tail fiber, which are thought to mediate
reversible attachment to the host cell via interactions with
polymannose O-antigens (30, 31). However, the existence of a
viable deletion mutant indicates that this gene is not
essential (32, 67, 71). The bent tail fiber sequence shares strong
sequence similarity with the side tail fiber protein of λ
phage and it has been suggested that there is extensive
horizontal transfer of such genes between phages (29). In
addition to the oad and ltf genes, the only other sequenced
region of the late genes appears to encode a protein with an
unexpected sequence. A partial
sequence from the late region of BF23 was determined and
found to contain collagen-like repeats (77). Similar genes
are present in other phages, despite the fact that no such
collagen-like genes have been identified to date among
the eubacteria. This raises the intriguing possibility that
the phages have been able to accept gene transfer from
the eukaryotes or have evolved the collagen repeat
independently. The function of the collagen-like protein
remains unknown, but it seems that such a protein is likely
to have a structural function, particularly as it appears
to be expressed from a late gene.

Concluding Remarks 

Many features of phage T5 remain enigmatic. What is the
identity of the nuclease responsible for host cell DNA
degradation, and what protects the phage DNA from its
attentions? What are the molecular mechanisms governing the
pause in DNA injection after FST and how is SST initiated?
Which protein(s) introduces the DNA nicks and what is
their role in vivo? How do phage proteins subvert the host
RNA polymerase, changing its specificity in tune with the
demands for pre-early, early, and late gene expression? How
do phage proteins overcome hostile host functions such as
restriction and methylation? Why has such an interesting
phage remained so much a mystery? In the past this has
been due to the difficulties of cloning large fragments of T5
DNA due to promoter strength and the toxicity of its gene
products. It is clear that the technical ability to sequence
this phage without cloning exists. It has long been possible
to sequence large double-stranded DNA such as λ using the
walking-primer approach (63). Two groups have
independently sequenced the T5 genome. A Russian group released
their sequence in April 2004 (accession number AY543070),
and French researchers shortly afterwards (accession
number AY692264). The genome consists of 121,750 base
pairs and some 162 genes are annotated. Access to the
genome sequence will undoubtedly aid and stimulate
researchers in the T5 field, but a cursory analysis reveals a
large proportion of proteins of unknown function, many of
which do not have any reasonably close homolog in the
current databases (not even cryptic ORFs from other phages
or bacteria). Moreover, T5 possesses many ORFs not
identified on the genetic map. Even in 2005, most of the questions
about T5 biology, first posed 30 or more years ago, remain
unanswered.
The T7 Group

Bacteriophage T7 is the prototype member of the T7
group of phages that are traditionally thought to be
obligately lytic. Although T3 and T7 were isolated in 1945,
early studies on them were much less extensive than those
on the T-even phages. However, in 1969 a comprehensive genetic
map of T7 was published (269, 276), several phage-coded
enzymes involved in DNA replication and transcription
were purified and studied biochemically (22, 27, 85, 96), and
T7 rapidly became a focus of attention. These studies have
continued to the present: the mechanism of transcription
by T7 RNA polymerase is known in exquisite detail, and
in vitro DNA replication catalyzed by T7 proteins is one of
the best understood model systems. In more recent times,
T7 RNA polymerase-based expression systems (279; also
see chapter 43), DNA sequencing using Sequenase, and a
T7 display system (229) have all made people extremely
familiar with T7 parts, if not with the phage itself. The ease
of laboratory manipulation of T7, its fast growth rate, and
of course the wealth of information that has accrued in
the past decades have also made T7 a prime model system
for experimental molecular evolution and evolutionary
genetics (17-19, 48, 100).

Overview 

Bacteriophage T7, together with its relative T3, was
originally isolated as a member of the seven Type phages that
grow on E. coli B (53). Plaques of T7 and T3 (but not
more distantly related phages) are characteristically large
on most E. coli strains, and they continue to expand on
extended incubation (319). Unlike T1, which also makes
large plaques, T7 and T3 do not cause persistent
contamination in the laboratory as they die rapidly on drying. On
wild-type hosts T7 plaque size generally decreases with
decreasing incubation temperature, the relative efficiency
of plating (eop) remains constant from 43°C to about 15°C,
and an uncharacterized mutant forms plaques at 10°C
at an eop ~1 (P. Kemp and I. J. Molineux, unpublished
observations).


Several T7-like coliphages have been isolated that by
morphology and hybridization studies are very closely
related to T7 itself (e.g., figure 20-1A). Genome sequences of
the close-knit T7 group are thus far limited to T7 and T3, the
yersiniophages φA1122 and φYeO3-12, and the Pseudomonas
putida phage gh-1 (62, 78, 135, 206, 207). All five exhibit
a highly conserved genome organization and, at the
sequence level, differ mainly in the presence or absence of
several nonessential genes, many of which exhibit homology
to homing endonucleases or other parasitic elements. In
accord with the modular evolution of phages, it is becoming
apparent that more distant relatives of T7 are abundant
whose genomes are organized differently and only a few of
whose genes are clearly homologous to T7. Among these
are the dual capsule-specific coliphage K1-5 and the
Salmonella phage SP6 (56, 240), the P. aeruginosa phage φKMV
(144), the Xanthomonas campestris pv. oryzae phage Xp10
(323), and the marine cyanophage P60 (66). The roseophage
SIO1 (224) and the Vibrio parahaemolyticus phage VpV262
(90) are even further removed from T7. Interestingly, the
Agrobacterium tumefaciens chromosome encodes proteins
with extensive similarity to T7 RNA polymerase (RNAP)
and the T7 tail fiber but no other easily recognized T7-like
genes (GenBank AE007869). Even more surprising is the
Pseudomonas putida KT2440 chromosome, which contains
a putative prophage, almost all of whose genes show greatest
similarity to T7, T3, or φYeO3-12 (201; GenBank NC_002947).
Most essential genes of T7 are present in the prophage,
notable exceptions being homologs to genes 0.7 and 2, which
modify the host transcription and translational machinery,
and the 17.5 holin. No repressor protein is obvious within
the prophage and, understandably, host promoters that
would allow expression of the phage RNAP are missing, as
determined by sequence inspection. The prophage is flanked
by a repeat sequence that inserted into tRNA-leu-1, and also
contains a putative integrase. The presence of this almost
complete copy of a T7-like genome as a cryptic prophage
complete with a putative integrase raises many questions
regarding the evolution and ecology of what are referred to
as obligate lytic phages.
Figure 20-1 Phage T7 and T7-like virions. A: Electron micrographs of T7, SP6, and K1-5. The internal core is visible but in T7
and K1-5 it has detached from the head-tail connector. Note also the different tail structure between T7 and SP6 or K1-5.
The thin T7 tail fibers are not apparent. Scale bars represent 50 nm. Courtesy of S. Adhya and D. Scholl. B: Cryo-electron
micrograph of a tailless T7 head showing the “concentric ring” pattern of the densely packed spooled DNA. The scale bar
represents 10 nm. Courtesy of M. Cerritelli. C: Schematic of the T7 virion indicating its protein components.


As more sequence information becomes available, any
distinct differences between T7 or its immediate family and
more distantly related phages will likely become even more
blurred. However, to date no T7-like phage has been found
to infect Gram-positive bacteria. This chapter will focus
mainly on T7 although significant differences that are
known in other phages will be emphasized. Earlier reviews,
especially those by Kruger and Schroeder (137) and Dunn
and Studier (62), should also be consulted.

The primary distinguishing features of the T7 group
have been that they are members of the Podoviridae and
code for a rifampicin-resistant, single-polypeptide RNAP
that is responsible for most phage gene expression. However,
this classification may need re-evaluation as Xp10 contains
a T7-related RNAP but has λ morphology, and the T7-like
marine phages SIO1 and VpV262 lack a phage-specific
RNAP. In addition to host range differences, the closer
relatives of T7 can be grouped by the promoter specificity of
the phage polymerase. Members of the same group have
a common promoter sequence: these phages recombine
extensively with each other but much less efficiently with
members of a different group.

Aside from the translational apparatus and the
biosynthetic machinery for precursor synthesis, T7 growth is
remarkably independent of host enzymes. E. coli RNAP is
used to make early RNAs, but most transcription is
catalyzed by the T7 enzyme and, except for thioredoxin, both
DNA replication and recombination are independent of
host proteins.

The T7 Virion 

A spherical approximation of the icosahedral (T = 7) T7
capsid has a diameter of 60-61 nm (figure 20-1A), enclosing
a volume of about 10⁵ nm³ (227, 267). The T7 genome
occupies about 45% of the head volume. The 40 kb
double-stranded DNA is in B-form, has a center-to-center helix
spacing of 2.4 nm, and is spooled around the internal core
in about six coaxial shells (23) (figure 20-1B). A genome as
large as 103% and as small as 85% unit length can be
packaged into infectious particles. The capsid shell consists of
415 molecules of gp10, about 95% of which is normally the
major capsid protein, gp10A. The remainder is the minor
capsid protein, gp10B, which results from a -1
reading-frame shift near the 3' end of gene 10A (44, 45, 62). The
growth of mutant phages that make only gp10A is
indistinguishable from wild-type, but plaques of phages that make
only gp10B are smaller than usual. T3 (43, 44, 45), and
several other members of the T7 group, are also thought to
produce a translationally shifted second capsid protein (95),
although phages gh-1, SP6, and K1-5 do not (135, 240).
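
The packaging limits and capsid composition quoted above translate into simple numbers. The following is a rough bookkeeping sketch, not a measurement: it assumes that "unit length" means the 39,937 bp T7 genome length given in this chapter, and it treats the "about 95%" gp10A fraction as exact. All function and variable names are illustrative.

```python
# T7 genome length quoted in this chapter (bp); assumed to be "unit length".
T7_GENOME_BP = 39_937


def packageable_range(unit_length_bp, min_fraction=0.85, max_fraction=1.03):
    """Return the (smallest, largest) genome size in bp that can be packaged."""
    return round(unit_length_bp * min_fraction), round(unit_length_bp * max_fraction)


def capsid_composition(total_subunits=415, major_fraction=0.95):
    """Split the gp10 shell subunits into (gp10A, gp10B) copy numbers."""
    major = round(total_subunits * major_fraction)
    return major, total_subunits - major


smallest_bp, largest_bp = packageable_range(T7_GENOME_BP)
max_deletion_bp = T7_GENOME_BP - smallest_bp
max_insertion_bp = largest_bp - T7_GENOME_BP
gp10a_copies, gp10b_copies = capsid_composition()

print(f"packageable genome: {smallest_bp}-{largest_bp} bp")
print(f"tolerated deletion ~{max_deletion_bp} bp, insertion ~{max_insertion_bp} bp")
print(f"gp10A ~{gp10a_copies} copies, gp10B ~{gp10b_copies} copies")
```

On these assumptions the head tolerates a net deletion of about 6 kb but a net insertion of only about 1.2 kb, which is why the capacity for foreign DNA is limited unless nonessential sequences are removed first.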

At one vertex of the gp10 icosahedron the
dodecameric head-tail connector (gp8) is inserted (figure 20-1C).
Expression of gene 8 clones results in mixtures of oligomers
that contain 12 or 13 molecules of gp8 (25, 132, 298), but only
12mers are found in mature virions or within filled heads.
The cryo-electron microscopic structure of the
dodecameric cloned T3 connector reveals a similar structure to
the φ29 and SPP1 connectors, a 12-lobed wide domain that
is inserted into the prohead and a narrower domain that
interacts with the tail (297, 298). An open channel runs
through the cloned connector, but the channel is closed
when the connector is isolated from virions (57). In the
latter work the channel through which the genome passes
has an average diameter of about 3.7 nm but with a
constriction down to about 2.2 nm. The constriction is close to that
of the diameter of duplex DNA, and may be one reason why,
in contrast to many other phages, there is no transient
leakage of ions following T7 infection (138). The change
in diameter of the channel may be due to conformational
changes in the connector associated with DNA packaging
during assembly or (perhaps in addition) to incorporation of
gp6.7 into the head. The product of gene 13 was once thought
to be part of the virion but is actually required to incorporate
gp6.7 (which migrates similarly to gp13 in many SDS gel
systems) into heads. Gp6.7 may be essential, but
morphologically normal particles are formed in its absence.

Three other head proteins form an internal, cylindrical
structure about 26 nm long by 21 nm wide that is attached
to, and coaxial with, the 12-fold symmetrical connector
(241, 243). The core consists of coaxially stacked rings and
exhibits 8-fold symmetry (26); 4 copies of gp16, 8 of gp15,
and 8-12 of gp14 are currently thought to comprise the
structure. The head also contains 15-20 copies of gp6.7
and a small amount of the nonessential gp7 (P. Kemp and
I. J. Molineux, unpublished observations). Their location
with respect to the core is not yet known (figure 20-1C).

The stubby T7 tail is 23 nm long, tapering from 21 nm
wide at the connector to 9 nm at the distal end (262). The
tail exhibits 6-fold symmetry and is estimated to consist of
12 copies of gp11 and 6 copies of gp12. About 30 copies of
gp7.3 are also part of the tail but their precise location has
not been determined. The six tail fibers, each consisting of
three parallel gp17 molecules, are attached, through an
N-terminal domain of the protein, near the top of the tail
(115, 263).

Genetic Organization 

The T7 genome contains 39,937 bp and codes for 56 known
or potential genes (62, 185) (table 20-1; figure 20-2). A 160 bp
direct repeat lies at each end of the T7 genome, the length
and sequence of the terminal repeat (TR) varying among
different members of the T7 group. Although circular
molecules can be formed in vitro after exonuclease digestion,
they are not found in vivo. Rather, linear concatemers,
containing only a single copy of TR between genomes, are
the initial products of DNA replication.
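
Because adjacent genomes in a concatemer share a single TR, the length of a concatemer is slightly less than a whole multiple of the mature genome length. A minimal sketch of the arithmetic, using only the 39,937 bp genome and 160 bp TR given above (the function name is illustrative, and a perfectly regular head-to-tail concatemer is assumed):

```python
GENOME_BP = 39_937          # mature T7 genome, including a TR at each end
TERMINAL_REPEAT_BP = 160    # length of one terminal repeat (TR)


def concatemer_length(n_genomes, genome_bp=GENOME_BP, tr_bp=TERMINAL_REPEAT_BP):
    """Length in bp of a linear concatemer of n genomes sharing one TR per junction."""
    if n_genomes < 1:
        raise ValueError("a concatemer contains at least one genome")
    # Each repeating unit is the unique sequence plus one TR;
    # the final genome contributes one extra TR at the free end.
    repeat_unit_bp = genome_bp - tr_bp
    return n_genomes * repeat_unit_bp + tr_bp


for n in (1, 2, 3):
    print(n, concatemer_length(n))
```

A dimer is therefore 160 bp shorter than two mature genomes, so each junction TR must be duplicated when mature, fully repeated genome ends are generated from the concatemer.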

T7 DNA contains only the four normal bases. A limited 
amount of adenine methylation occurs from the type I 
restriction-modification enzyme and from Dam and Dcm 
methylases, but not all potential sites are fully methylated 
because the rate of T7 DNA replication and packaging is 
so high. Some of the other T7-like phage DNAs, including
that of T3, are completely unmethylated since they code for 
an S-adenosylmethionine hydrolase, which removes the 
methyl group donor. 

Sequences corresponding to many type II restriction
enzymes with palindromic recognition sites are grossly
underrepresented in the genomes of the T7 phage family.
Avoidance of palindromic sequences is the primary
mechanism used to avoid restriction by type II enzymes. T7 is
also sensitive to restriction by type III enzymes and thus
does not grow in P1 R+M+ lysogens. However, T7 growth
is insensitive to EcoP15 as the restriction enzyme requires
two copies of 5'-CAGCAG/5'-CTGCTG in opposite
orientations (176), and all 36 copies of this hexamer have the same
polarity in the T7 genome. T3 is restricted by EcoP15 since its
genome contains the recognition sequence in both
orientations. The genomes of all immediate members of the T7
group examined to date contain remnants of homing
endonucleases or related mobile parasitic elements. Although
some are inserted within coding regions, most are present
as apparently precise insertions between genes but they
are not always present at the same locations in different
genomes. These genetic elements have no known function
in phage development, although a frameshift mutation in
T7 gene 2.8 is translationally polar on gene 3 (225).
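
The EcoP15 argument above amounts to a strand-orientation check that is easy to express in code. The sketch below is a deliberately simplified illustration of the rule as stated in the text (sites must occur in both orientations); it ignores site spacing, the relative arrangement of the sites, and methylation state. The toy sequences are invented for the example and are not phage sequences, and the function names are hypothetical.

```python
def count_occurrences(sequence, motif):
    """Count occurrences of motif in sequence, allowing overlaps."""
    count = 0
    start = sequence.find(motif)
    while start != -1:
        count += 1
        start = sequence.find(motif, start + 1)
    return count


def ecop15_site_orientations(top_strand):
    """Return (forward, reverse) counts of the EcoP15 site on one DNA strand.

    5'-CAGCAG read on the top strand is a forward site; 5'-CTGCTG on the
    top strand means that CAGCAG lies on the complementary strand.
    """
    seq = top_strand.upper()
    return count_occurrences(seq, "CAGCAG"), count_occurrences(seq, "CTGCTG")


def predicted_ecop15_sensitive(top_strand):
    """Predict restriction only if sites are present in both orientations."""
    forward, reverse = ecop15_site_orientations(top_strand)
    return forward > 0 and reverse > 0


# Invented toy sequences: one with same-polarity sites, one with both polarities.
same_polarity = "AACAGCAGTTTTCAGCAGAA"
both_polarities = "AACAGCAGTTTTCTGCTGAA"
print(predicted_ecop15_sensitive(same_polarity))    # False
print(predicted_ecop15_sensitive(both_polarities))  # True
```

Applied to a full genome sequence, the same counts would distinguish the T7 situation described above (all sites in one orientation) from the T3 situation (sites in both orientations).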

In vitro, the Escherichia coli RecBCD nuclease rapidly
degrades the linear T7 genome. T7 gp5.9 binds
stoichiometrically to RecBCD holoenzyme, inhibiting both its nuclease
and ATPase activities (153). However, gene 5.9 is
nonessential in RecBCD-containing cells, and gp5.9 is not found in
the phage particle. The infecting T7 genome is most likely
protected from RecBCD by one or more of the proteins
ejected from the virion or by a cellular component.

All transcription is from left to right on the T7 genetic
map; genes and regulatory elements are defined by their
position on the nontranscribed strand (figure 20-2). Genes
are numbered sequentially in the order that they are
transcribed initially; T7 promoters (prefix φ), and RNase III
processing sites (prefix R) are usually named by the gene
immediately downstream. Whole-number genes were
originally defined by complementation tests as being essential
(269), whereas most noninteger genes were first defined
as being nonessential or as having conditionally essential
activities. More recent studies have shown that genes 2.5,
6.7, and 7.3 may be considered essential, and that gene 7
is nonessential (table 20.1). In addition, but for unknown
reasons, gene 2 is not essential on E. coli C strains (273).



Table 20.1 T7 genes

Gene                  Function(a)                                                   Selected references

Class I
0.3                   B-DNA mimic; anti-type I restriction                          301
0.4, 0.5, 0.6A, 0.6B  Not conserved; nonessential
0.7                   Protein kinase; host-transcription shutoff; Col Ib exclusion  84, 211, 217
1                     T7 RNA polymerase                                             159, 287, 321, 329
1.1                   Conserved, nonessential
1.2                   E. coli dGTPase inhibitor; F-exclusion                        188, 190, 233, 235, 237
1.3                   DNA ligase                                                    271, 262

Class II
1.4, 1.5, 1.6         Not conserved, nonessential
1.7                   Full-length gene not conserved; beneficial for growth
1.8                   Poorly conserved, nonessential
2                     E. coli RNAP inhibitor                                        200
2.5                   SSB                                                           128, 133
2.8                   Not conserved, nonessential; homing endonuclease?
3                     Endonuclease I, Holliday junction resolvase                   51, 122
3.5                   Amidase (lysozyme); regulates T7 RNAP activity                329
3.8                   Not conserved, nonessential; homing endonuclease
4A                    Primase-helicase; gp4B helicase from internal in-frame start  69
4.1, 4.2              Overlappons; not conserved
4.3, 4.5              Conserved, nonessential                                       275
4.7                   Not conserved, nonessential                                   275
5                     DNA polymerase                                                58, 148, 149
5.3                   Not conserved, nonessential, homing endonuclease
5.5                   Conserved, nonessential, binds E. coli H-NS; λ rex exclusion  153, 154
                      Non-conserved -1 frameshift leads to T7 5.5-5.7 fusion
5.7                   Conserved, nonessential
5.9                   Inhibits RecBCD nuclease, nonessential, not conserved         153
6                     5'-3' double-stranded exonuclease, RNase H                    147, 251
6.3                   Poorly conserved, nonessential

Class III
6.5                   Conserved, nonessential
6.7                   Virion protein; ejected into infected cell                    119
7                     Nonessential, not conserved; host range                       62, 273
7.3                   Essential virion protein; ejected into infected cell          (P. Kemp and I. J. Molineux, unpublished observations)
7.7                   Not conserved; homing endonuclease
8                     Head-tail connector protein                                   25, 298
9                     Scaffolding protein                                           24
10A                   Major capsid protein; -1 frameshift yields minor capsid       24, 45, 188, 190
                      protein gp10B; F exclusion
11                    Tail protein                                                  262
12                    Tail protein                                                  262
13                    Essential; required for gp6.7 incorporation in virion         (P. Kemp and I. J. Molineux, unpublished observations)
14                    Internal core protein; ejected into infected cell             189
15                    Internal core protein; ejected into infected cell             189
16                    Internal core protein; ejected into infected cell             183, 184, 189, 268
17                    Tail fiber protein                                            115, 263
17.5                  Class II holin                                                302
18                    Small terminase subunit                                       89, 309
18.5-18.7             Conserved; λ Rz-Rz1 homologs                                  21
19                    Large terminase subunit                                       193, 195, 310
19.2, 19.3            Overlappons, conserved
19.5                  Nonessential, conserved                                       126, 127

(a) Conserved or not conserved refers to close relatives of T7.
[Figure 20-2: genetic map of T7. Regions: Class I (early), Class II (DNA metabolism), Class III (morphogenesis). Scale marks at 10, 20, 30, and 39.937 kb.]

Figure 20-2 Schematic of the genetic organization of bacteriophage T7. Gene expression is from left to right and is
temporally controlled; that is, left-most genes are transcribed first, right-most genes last. Only genes with known function
are named. T7 promoters are indicated by horizontal arrowheads on continuous vertical lines. The three major E. coli
promoters (combined into one) are indicated by an arrowhead on a dashed vertical line. Transcription terminators are
represented by filled circles for E. coli RNA polymerase (dashed line) and T7 RNA polymerase (continuous); the primary origin
of DNA replication is indicated. Arrowheads below the map represent sites of RNase III processing. For the exact
locations of all genetic elements see Dunn and Studier (62) and GenBank NC_001604.


Genes are close-packed, and about 92% of the genome is
coding sequence; the arrangement of close-packed and
slightly overlapping genes in T7 may make translational
coupling quite common. There are five cases of substantial
or complete gene overlap, where the second gene is in
a different reading frame. Genes 4.1 and 4.2 lie almost
entirely within gene 4, gene 18.7 lies within gene 18.5, and
genes 19.2 and 19.3 both overlap gene 19. In addition, gene
4 codes for two proteins via an internal in-frame initiation;
the longer gp4A is both a primase and a helicase, whereas
gp4B has only helicase activity. Three cases of programmed
translational frameshifting have been proposed in T7,
but only the -1 shift that produces the gp10B protein has
been characterized (44, 45, 62).

A time course of protein synthesis reveals three classes of
gene products: class I, or early, genes are expressed from
about 2 to 8 minutes post-infection, class II genes from
about 6 to 15 minutes, and class III genes from 8 minutes
until lysis, which begins after 25 minutes at 30°C (270).
The 10 class I genes, 0.3-1.3, are transcribed by E. coli
RNAP, but only one, that for T7 RNAP, is essential. Genes 1.1
through 6.3 comprise the class II region and genes 6.5-19.5
the class III region. Both class II and III genes are
transcribed by T7 RNAP. In general, class I genes serve to
establish favorable conditions in the cell for phage
development, class II genes are involved in DNA metabolism,
and class III genes have DNA packaging, virion assembly, and
lysis functions.
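
The class boundaries stated above can be expressed as a small lookup. The following is a minimal Python sketch and is not part of the original chapter; the function name gene_classes, the treatment of gene designations as decimal numbers, and the stripping of A/B suffixes are illustrative assumptions. Because genes 1.1-1.3 lie in both the early transcription unit and the class II region, the sketch returns every class whose range contains the gene.

```python
from decimal import Decimal
from typing import List

# Region boundaries as quoted in the text (gene designations, not map
# coordinates). Genes 1.1-1.3 fall inside both the class I and class II ranges.
CLASS_RANGES = {
    "I": (Decimal("0.3"), Decimal("1.3")),
    "II": (Decimal("1.1"), Decimal("6.3")),
    "III": (Decimal("6.5"), Decimal("19.5")),
}


def gene_classes(gene: str) -> List[str]:
    """Return the temporal class(es) whose range contains a T7 gene.

    Assumption: a designation such as "4A" or "0.6B" is compared by its
    numeric part only.
    """
    number = Decimal(gene.rstrip("AB"))
    return [
        name
        for name, (low, high) in CLASS_RANGES.items()
        if low <= number <= high
    ]


if __name__ == "__main__":
    for gene in ("0.3", "1", "1.2", "4A", "10A", "19.5"):
        print(gene, gene_classes(gene))
```

Using Decimal rather than float keeps comparisons such as 1.3 <= 1.3 exact, which matters only at the range boundaries.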

Adsorption and Host Range 

The host range of wild-type T7 appears to be restricted to
E. coli, rough S. typhimurium LT2, and various Shigella
strains (T7 undergoes an abortive infection in Shigella
sonnei D2-3748; 6, 94, 212). Extended host range mutants of
T7 also allow growth in Yersinia pestis and Yersinia
pseudotuberculosis and, vice versa, comparable mutants of
Y. pestis φA1122 grow in E. coli (78; E. Ramunculov, M. C. Chu
and I. J. Molineux, unpublished observations). Extended or
altered host range mutants of the T7 group of phages are
readily isolated, and "adaptation" of phages to a variety of
different hosts was once a common practice (e.g., 145). This
practice, whether intentional or otherwise, has resulted in
some confusion. The Y. pestis phage φA1122 has been described
as a coliphage without indication that a host range
mutant was being used, and even T7 and T3 have been
misidentified by more than one laboratory (274). As many
spontaneous extended host range mutants infect their new
host less efficiently than they infect their original host, any
inadvertent use of mutants can be misleading. Tail fiber
interaction with lipopolysaccharide (LPS) is a necessary
early step in infection, and as LPS structures are themselves
rather variable, the tail fiber is likely a malleable structure
that easily adapts to a variety of receptors.

On E. coli B strains, T3 adsorbs to the penultimate glucose
of the LPS (213), the same residue as T4. As a consequence,
B/3 strains, which have lost both terminal glucose
molecules, are also B/4 but remain T7-sensitive; the initial
receptor for T7 is deeper within the inner LPS core, and B/7
strains are obligatorily B/3,4. T7 grows normally on most
E. coli C and K-12 derivatives; T3 also grows on E. coli C but
fails to adsorb to many common laboratory K-12 strains.
Extended host range mutants with an altered tail fiber (gp17)
overcome the adsorption defect. Neither T7 nor T3 forms
plaques on smooth E. coli strains that contain a complete
outer core LPS, nor do they plaque on capsulated strains.
However, the Salmonella phage SP6 grows on both rough and
smooth strains. Phages related to T7 or SP6 grow on various K
antigen-containing E. coli strains (39, 140, 141, 155, 210,
239). Remarkably, each virion of the SP6-related coliphage
φK1-5 contains two sets of tail fibers, one of which adsorbs
to the K1, the other to the K5 capsule (239). The amino acid
sequence of the tail fiber of φK1F is related to that of T7 at its
N-terminal end; the middle domain, which is homologous
to the tail fibers of other K1-specific phages, contains the
endosialidase motifs. The sequence of the Klebsiella phage
K11 tail fibers is not yet known, but it has been shown that
the capsule-degrading activity resides in the tail structure
(231). The C-terminal sequences of the φK1F tail fiber exhibit
patches of similarity to other tail fibers, including those
of morphologically unrelated phages, an example of mosaicism
among phage adsorption organelles (87).

The conservation of N-terminal tail fiber sequences
among T7 group members is consistent with this region of
the fiber binding to the tail (115, 263), and sequence
divergence at the C-terminal end suggests that the latter
corresponds to the LPS-binding ligand. This simple idea fits with
the mutational sites in T3 host range mutants (selected on
E. coli strains that do not adsorb T3+) being located close
to the 3' end of the gene (M. I. Pajunen, I. J. Molineux and
M. Skurnik, unpublished observations). However, some
φA1122 extended host range mutants that grow on both
E. coli and Y. pestis are altered near the N-terminus of
gp17 (78).

Distinct from the failure to adsorb or the susceptibility
to endogenous restriction enzymes, T7 fails to productively
infect a number of natural E. coli strains. In some cases,
normally nonessential genes allow growth and thus only
the mutant phage is excluded, but growth in S. sonnei
D2-3748 requires T7 to contain a missense mutation in the
gene for the major capsid protein (212). The best understood
T7 abortive infection, which bears some similarities to
the classic system of T4 rII mutant exclusion by λ rex+
lysogens, is the failure of T7 to grow in cells containing the
conjugative plasmid F (164). F exclusion is found using
most close relatives of T7 (including SP6; 331) but, notably,
not T3.


DNA Penetration 

After tail fiber attachment to LPS, the tip of the tail likely
binds a cellular component, but any specificity to the
interaction has not been demonstrated. However, an expansion
of the phage SP6 host range is conferred by mutant tail
proteins (J. Kieleczawa and I. J. Molineux, unpublished
observations), suggesting that such an interaction may be an
important step in the adsorption process. Gp7.3 and gp6.7
are ejected from the virion into the outer membrane, where
they are degraded (120). This step likely corresponds to the
beginning of the eclipse phase, as particles lacking either
protein are noninfectious. The internal core structure then
disaggregates. Its components pass through the connector
and the tail and then enter the cell (189). The gp14 molecules
ejected from the infecting particle have been localized to
the outer membrane, where they presumably form a channel,
but no structural information is currently available.
Gp16 is likely the next protein to enter the cell, with its
N-terminal sequences leading. The N-terminal portion of
gp16 is homologous to the catalytically active region of
E. coli soluble lytic transglycosylase (183), and gp16 has been
shown to possess peptidoglycan hydrolase activity (184).
This activity is essential for phage growth when cells are in
the late logarithmic stage of growth or are growing at low
temperature. The natural crosslinking of the Gram-negative
cell wall is a barrier to diffusion of molecules greater than
approximately 50 kDa (52). The conditions where gp16
hydrolase activity is essential for phage growth are those
where the peptidoglycan is thought to be more highly
crosslinked, so that efficient infection requires enlarging
a hole across the cell wall.

Membrane fractionation of T7-infected cells suggests
that the gp15 and gp16 ejected from the infecting particle
span both the outer and cytoplasmic membranes. It is likely
that the N-terminal region of gp16 abuts the presumptive
gp14 channel across the outer membrane and extends
across the periplasm (P. Kemp, C.-Y. Chang, L. R. Garcia and
I. J. Molineux, unpublished observations). Although
topological studies have not been conducted, it is currently
thought that a C-terminal domain of gp16 spontaneously
inserts into the cytoplasmic membrane from its outer face,
thereby making a channel for DNA translocation. The
C-terminus of gp16 interacts with gp15 (C.-Y. Chang and
I. J. Molineux, unpublished observations), which suggests
that the latter may be proximal to the cell cytoplasm. Both
gp15 and gp16 are large proteins (84.2 and 143.8 kDa,
respectively) and therefore cannot maintain an extensive
tertiary structure as they exit the virion, through the
connector and tail, in order to enter the cell. About 50% of
gp16 is predicted to be made up of short α-helices
interspersed with random coils. It was suggested that gp16 might
be comparable to a folding carpenter's rule, where each
segment of the rule corresponds to an α-helix. By opening
out from a folded rule conformation, gp16 may be able
to exit the capsid as an extended molecule while retaining
sufficient structural information to refold rapidly in the
periplasm (189).

The combination of gp14, gp15, and gp16 thus forms
a channel from the tip of the phage tail across the entire cell 
envelope, providing the genome access to the cell cytoplasm. 
T7 and its close relatives, which all code for homologous 
proteins, may therefore be described as having extensible 
tails, in contrast to the contractile tails of the Myoviridae. 

When the trans-envelope channel has formed, about
850 bp of the 40 kb T7 genome enters the cell. There is
a block to further efficient genome internalization that
is normally removed by transcription (79, 81). The value of
850 bp results from the plating behavior, at temperatures
below 30°C, of T7 deletion mutants that lack the major
promoters (A1, A2, A3), and is comparable to estimates
obtained in earlier studies at 30°C and 37°C using other
approaches (187, 324, 325, 326). At 30°C, the block in the
initial transcription-independent DNA translocation is not
absolute, since at least 4 kb of all infecting genomes enter
a rifampicin-treated cell within about 30 minutes (268). Even
after 24 hours of infection, however, only about 20% of
infecting genomes are completely internalized in
rifampicin-treated cells. At least two additional major
barriers to this transcription-independent DNA translocation
are thought to exist. There are no known DNA
sequences that create a barrier to genome internalization.
The block after 850 bp has entered is thought to result from
a measurement of DNA length (81), whereas the other major
barriers to DNA translocation have not been precisely
located.

Mutations affecting a central region of gp16 allow complete
genome entry in the absence of any transcription at a
constant rate of approximately 70-75 bp per second at 30°C
(268). The mutants were isolated by selection for a
gain-of-function phenotype, but all mutant proteins direct the same
rate of genome internalization, suggesting that they have
simply lost the ability to block transcription-independent
DNA translocation. The rate of DNA translocation is
highly temperature-dependent and the data fit Arrhenius
kinetics, a result that is inconsistent with the process
being driven by entropy or by energy stored in the packaged
DNA within the virion (118). Particles containing an
altered gp15 may exhibit modified kinetics of genome
internalization, but it is not yet known exactly how (or whether)
gp15 and gp16 function together in translocating DNA.
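
For readers unfamiliar with the term, Arrhenius kinetics refers to the standard exponential dependence of a rate on absolute temperature. The expression below is the textbook form, given only as a reminder; the symbols (v for the translocation rate, A for a pre-exponential factor, E_a for an apparent activation energy, R for the gas constant, T for absolute temperature) are notation assumed here, and no parameter values are taken from the chapter.

```latex
v(T) = A \exp\!\left(-\frac{E_a}{R\,T}\right)
\quad\Longrightarrow\quad
\ln v = \ln A - \frac{E_a}{R}\cdot\frac{1}{T}
```

A straight line in a plot of ln v against 1/T is what "the data fit Arrhenius kinetics" means in practice.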

DNA translocation also requires a membrane potential.
If the latter is collapsed while the genome is actively
traversing the cytoplasmic membrane, then no further DNA
translocation is possible. Thus, the membrane potential appears
to provide the energy necessary to move DNA from the
virion into the cell, but it is also necessary for formation
and stability of the channel across the cytoplasmic
membrane. Although neither gp15 nor gp16 contains
ATP-binding motifs, it is not clear whether ATP is also necessary
for transcription-independent DNA translocation from the
virion across the membrane.

The leading 850 bp of T7 that enters the cell by the
above process contains the T7 φOL promoter and the strong
E. coli promoters A1, A2, and A3. Transcripts from T7 φOL
have not been detected during infection (T3 φOL transcripts
are seen; 1), but φOL can direct genome entry if T7 RNAP
is present in the cell prior to infection (79, 187). Normally,
however, E. coli RNAP recognizes one or more of the A
promoters and, by transcribing the class I region, internalizes
at least 7 kb of the genome at approximately 40 bp per second
at 30°C. All three promoters are used at about the same
efficiency in vivo under normal laboratory conditions (60, 272).
In vitro, A3 is the strongest promoter at low temperature
but at 37°C A1 is strongest (49). Perhaps A2 is most active
under other, nonstandard conditions, and the three different
promoters ensure efficient genome entry (as well as messenger
RNA (mRNA) synthesis) under a variety of environmental
conditions. Although E. coli RNAP is normally considered
to terminate transcription at the early gene terminator,
TE, termination is not 100% efficient and the entire T7
genome can be internalized using transcription by the host
enzyme. A consensus boxA sequence lies downstream of
the promoters, and nus-dependent transcription
antitermination at boxA is necessary for complete genome
internalization by RNAP in the absence of phage protein synthesis
(W. P. Robins and I. J. Molineux, unpublished observations).
Neither nusA, nusB, nor boxA is essential for normal phage
growth, but an antitermination system that also reduces
RNAP pausing could explain why the rate of T7 genome
internalization by E. coli RNAP is faster than the average
rate of cellular mRNA synthesis. Multiple transcripts
originating from the A promoters allow synthesis of T7 RNAP,
which normally completes the genome entry process at an
estimated 200-300 bp per second at 30°C (79).

Complete internalization of the T7 genome in a productive
infection takes about 10 minutes, approximately twice
the time calculated from the measured rates of DNA
translocation. However, this calculated value does not take into
account the time taken to assemble the cell envelope
channel for DNA transport nor any delay in promoter
recognition and genome internalization by the newly synthesized
T7 RNAP. The slow entry of the T7 genome into the cell
provides an obvious mechanism for T7 gene regulation, as
early genes are the first, and late genes the last, to enter the
infected cell. However, patterns of gene expression and
phage yields are fairly normal, albeit a few minutes faster
than usual, when T7 DNA is rapidly introduced into cells
from a λ virion (82). How this is achieved is not understood,
but the presence of the strong late T7 promoters in
the cell does not prevent the weaker class II promoters from
functioning.
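
The "calculated" entry time mentioned above can be reproduced with simple arithmetic. The sketch below is illustrative only and is not the authors' calculation: it assumes that the first 850 bp enter without transcription, that E. coli RNAP carries the genome to the 7 kb mark at 40 bp per second, and that T7 RNAP handles the remainder of the 39,937 bp genome at a mid-range 250 bp per second. The handoff position, the mid-range rate, and the function name entry_time_seconds are assumptions.

```python
GENOME_BP = 39_937  # genome length taken from the figure 20-2 scale


def entry_time_seconds(
    ecoli_rate: float = 40.0,   # bp/s at 30 C, E. coli RNAP (from the text)
    t7_rate: float = 250.0,     # bp/s, assumed midpoint of the quoted 200-300
    initial_bp: int = 850,      # enters before any transcription
    handoff_bp: int = 7_000,    # assumed point where T7 RNAP takes over
) -> float:
    """Estimate the time for transcription-coupled genome entry, in seconds.

    Channel assembly and any lag before T7 RNAP is made are ignored,
    which is why the result is shorter than the observed time.
    """
    ecoli_phase = (handoff_bp - initial_bp) / ecoli_rate
    t7_phase = (GENOME_BP - handoff_bp) / t7_rate
    return ecoli_phase + t7_phase


if __name__ == "__main__":
    for rate in (200.0, 250.0, 300.0):
        minutes = entry_time_seconds(t7_rate=rate) / 60
        print(f"T7 RNAP at {rate:.0f} bp/s: {minutes:.1f} min")
```

Under these assumptions the estimate falls between roughly 4.4 and 5.3 minutes, which is consistent with the statement that the observed time of about 10 minutes is approximately twice the calculated value.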

Gene expression and phage yield are also fairly robust to
a significant change in gene order (65). The major biological
reason for coupling transcription to genome internalization
is to ensure that the first gene product to be made,
the antirestriction protein gp0.3, functions before type I
enzymes recognize their cognate sites on the entering
DNA. T7 is normally resistant to type I restriction but
becomes sensitive when its genome enters the cell from
a λ virion (82).

T3 is also known to internalize its genome via transcription
(208, 324). The close genetic similarity of the Yersinia
phages φA1122 and φYeO3-12, and perhaps phage SP6 and
the Pseudomonas phage gh-1, to T7 and T3 implies that their
genomes are also internalized by transcription. However,
P. aeruginosa RNAP cannot internalize the φKMV genome by
a comparable process. The time taken for host RNAP to
transcribe the phage RNAP from near the genome end is about
the entire latent period. Thus host RNAP-catalyzed genome
internalization is unlikely to be a distinguishing feature of
the T7 family.

Early Gene Expression 

The three A promoters near the left genome end are among
the strongest known and direct essentially all transcription
by E. coli RNAP in vivo. In their absence the minor B or
C promoters can be used to drive transcription-mediated
genome entry and synthesis of the gene 1 mRNA. Several
other E. coli σ70 promoters have been found on the genome
by in vitro studies, or inferred from the properties of
plasmids containing T7 DNA. None seems likely to have
significance in vivo, however. Antitermination of transcripts at
a boxA element between the A promoters and the first gene,
0.3, is likely the reason why no amber mutants exhibiting
transcriptional polarity have been found in T7. The terminator
TE downstream of gene 1.3 stops most transcribing
complexes and thus defines the end of the class I, or early,
region of the genome (123, 270). Termination at T7 TE is
Rho-independent both in vitro (28) and in vivo, but its
efficiency is dependent on phosphorylation of RNAP by the
gp0.7 protein kinase (211, 330). Interestingly, T3 TE, which
is almost identical by sequence to T7 TE, terminates
transcription efficiently in vivo but not in vitro. A novel
transcription factor, tau, is required for termination at T3 TE
and it also enhances termination at T7 TE (15, 16).
Transcription that reads through TE terminates specifically (but also
inefficiently) near the 3.8 promoter and RNase III recognition
site, and at Tφ, the terminator for T7 RNAP. E. coli
RNase III processes the approximately 7 kb early RNA at
five sites. Processing upregulates expression of gene 0.3 (61)
but has little effect on other genes (early or late), and phage
growth is almost normal in RNase III (rnc) null mutants.

As a consequence of gene 0.7 activity, host RNA and
protein synthesis have stopped by about 4 minutes of infection
at 30°C, and as a result early T7 gene expression
has ceased by about 8 minutes (173). T7 RNAs have been
reported to be stable (166, 282), or chemically stable but
functionally unstable (315-317). Competition for ribosomes
between the 0.3 mRNA and the more abundant late mRNAs was
proposed to explain the shutoff of gp0.3 synthesis (264-266),
but discrimination between early and late mRNAs
was not observed in another study (312). Gene 0.7 positively
regulates RNase III activity (172), which should stabilize
mRNAs; furthermore, gene 0.7 also inhibits RNase E (165),
the major mRNA degrading activity in E. coli. The issue of
T7 mRNA stability and translational discrimination clearly
warrants further investigation.


Early Proteins 

The functions of five early proteins are known in detail. 
Gene 0.3 is expressed at very high levels, and protects T7 
from various type I restriction systems (2, 3, 273). Gp0.3 has 
been crystallized (301); the protein is a dimer and mimics 
the shape of B DNA. Gp0.3 has been shown to bind the 
EcoKI restriction enzyme in a 2:1 complex and to displace 
the enzyme from its target DNA (2). Although T7 gp0.3 
shares little sequence similarity with T3 gp0.3, the latter 
not only provides the same activity as T7 gp0.3 but also 
hydrolyzes AdoMet, the methyl donor for DNA modification 
(278). Both the T7 and T3 types of gene 0.3 are found in other
T7 group members. 

Gene 0.7 codes for a protein that inactivates host-catalyzed
transcription. It is also a serine-threonine protein
kinase (216). The two activities are separable (253). Early
T7 RNA and protein synthesis is prolonged after infection
by 0.7 mutants. Gp0.7 phosphorylates many host proteins,
including ribosome components (220, 221), ribonucleases
III and E (165, 172), and the β and β' subunits of E. coli
RNAP (330). The BR3 (271) and Y49 strains (98) contain the
rpoC-E1158K mutation (200), which renders 0.7 essential
for phage growth. The mutation affects the region of β' to
which the T7 protein gp2 binds. Gene 0.7 is nonessential
in wild-type hosts and is actually detrimental when T7 is
grown at 30°C in rich media. However, phage growth at
elevated temperatures or in minimal media, or in cells
harboring the Col Ib plasmid, renders gene 0.7 essential (84, 102).
The processes underlying the conditional necessity for 0.7
function have not been determined.

Gene 1.2 inhibits E. coli dGTPase and converts it into
an rGTP-binding protein (5, 105, 198), which is important for
T7 only when the dGTPase (dgt) gene is overexpressed
(optA1 mutation). T7 gene 1.2 also causes phage exclusion
by the F plasmid, but T3 gene 1.2 overcomes the exclusion
system (188). The ATP-dependent DNA ligase (gene 1.3) is
only essential in E. coli lig mutants (271), but its presence
enhances phage growth in wild-type hosts. The protein was
the first of the nucleotidyl transferase superfamily to be
crystallized (280).

T7 RNA polymerase 

T7 RNAP (gene 1) has been crystallized (257); structures
have also been obtained with gp3.5 lysozyme, and with
DNA in both initiation and elongation complexes (30, 31,
114, 287, 290, 320, 321). T7 RNAP is a single-chain enzyme
of 883 amino acids that is related by sequence to
mitochondrial and chloroplast RNAPs, and by structural
considerations to the DNA polymerase I family. A Y639F change in T7
RNAP allows efficient usage of dNTPs as substrates (258). T7
RNAP and its interactions with both DNA and RNA have
also been intensively studied by mutational and crosslinking
approaches and by kinetic analyses.

The 23 bp T7 promoter can be divided into a binding
region, which extends from -17 to -5 and includes the
determinants for promoter specificity, and an initiation
region that extends from -4 to +6 (111). Both ranges are
relative to the transcription start site at +1. T7 RNAP binds
to one face of the closed DNA duplex promoter (294, 295)
via the specificity loop (amino acids 739-770) (30) and an
AT-rich recognition loop (amino acids 93-101). Promoter
opening between base pairs -5 and -4 is thermodynamically
driven (4), being favored by promoter bending (296)
and by a β-hairpin of RNAP that includes V237 (260).
Initiation of transcription is preferentially with GTP (167) and is
very fast, rates of approximately 3 per second having been
observed in vivo (156). The unstable initiation complex is
converted into a stable elongation complex when the nascent
RNA has reached approximately 12-14 nucleotides in
length (179). Conversion involves an interaction of the
specificity loop (289) and is accompanied by a major change in
enzyme conformation (110, 160, 168, 259). This can be seen
by comparison of the structures of RNAP as initiation and
elongation complexes (30, 287, 320). The rate of mRNA
synthesis catalyzed by T7 RNAP is at least 5-fold faster than that
of the E. coli enzyme and has been estimated to be 200-240 bases
per second at 37°C, both in vitro and in vivo (14, 163, 329).
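
The promoter coordinates above skip position 0, so the span from -17 to +6 covers 17 + 6 = 23 bp. The following Python sketch splits a 23 bp promoter into the binding and initiation regions as defined in the text. The sequence used is the widely quoted T7 class III consensus, which is supplied here from general knowledge and does not appear in the chapter; the function names are hypothetical.

```python
from typing import List, Tuple

# Widely quoted T7 consensus promoter, -17 to +6 (assumption: not given in
# the chapter itself). The first G of GGGAGA is the +1 start site.
CONSENSUS = "TAATACGACTCACTATAGGGAGA"


def promoter_positions(first: int = -17, last: int = 6) -> List[int]:
    """Promoter coordinates relative to the start site; there is no position 0."""
    return [p for p in range(first, last + 1) if p != 0]


def split_promoter(seq: str) -> Tuple[str, str]:
    """Split a 23 bp promoter into (binding region, initiation region).

    Binding region: -17 to -5. Initiation region: -4 to +6.
    """
    positions = promoter_positions()
    if len(seq) != len(positions):
        raise ValueError(f"expected {len(positions)} bp, got {len(seq)}")
    binding = "".join(b for p, b in zip(positions, seq) if p <= -5)
    initiation = "".join(b for p, b in zip(positions, seq) if p >= -4)
    return binding, initiation


if __name__ == "__main__":
    binding, initiation = split_promoter(CONSENSUS)
    print("binding    (-17..-5):", binding)
    print("initiation (-4..+6): ", initiation)
```

The binding region is 13 bp and the initiation region 10 bp; promoter opening between -5 and -4 therefore falls exactly at the boundary between the two strings.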

Although T7 RNAP is highly specific for its promoter,
natural promoters vary in sequence. The 17 promoters
on the T7 genome can be divided into three groups: six
consensus class III promoters (including φOR), 10 class II
promoters that vary from consensus by two or more positions,
and the promoters close to each genome end, designated
φOL and φOR. φOL and φOR are not known to
be important for gene expression and have been described
as replication promoters. Both φOL and φOR have been
reported to initiate replication in mutants lacking the
primary origin (59, 215, 288).

RNAPs of different members of the T7 family may recognize
different promoters, and six specificities are currently
known. As more distant relatives of the family are characterized,
this number will likely increase. There is surprisingly
little cross-talk between RNAP and heterologous promoters
even though a single amino acid change in T7 RNAP is
sufficient to switch specificity to that of a T3 promoter (218,
219, 226). Similarly, a single change at -11 in the 23 bp
promoter is sufficient to cause a specificity switch (131); two
changes in the T7 promoter also allow recognition by phage
SP6 RNAP (152). A comprehensive study on SP6 promoter
variants has been completed (249).

The activity of T7 RNAP is modulated by gp3.5 lysozyme.
Mutants that exhibit altered susceptibility to lysozyme have
been characterized (186, 328). It is not obvious from the
crystal structure of the RNAP-lysozyme complex (114) how
all the mutations, which are scattered across the RNAP gene
and alter widely spaced residues, exert their effect. However,
the large conformational changes that occur in the enzyme
during the transcription cycle are not captured in the
structure of the RNAP-lysozyme complex. Lysozyme is responsible
for the switch from class II to class III gene expression
(109, 174, 175, 186) during infection. Lysozyme destabilizes
initial transcription complexes (104, 139, 300, 329), and
as T7 class II promoters are inherently weaker than their
class III counterparts the former are preferentially affected.
Although it remains bound to RNAP, lysozyme has no effect
on an elongation complex. However, transcription termination
may be affected.

T7 RNAP is relatively insensitive to E. coli RNAP termination
signals, although some, including rrnB T1 and T2, and
λ tR', are moderately efficient (34, 91, 113, 158, 159, 161, 162,
329). Notably, TE, which terminates transcription by E. coli
RNAP at the end of the early region, has no effect on T7
RNAP. Two T7 RNAP terminators exist in the phage
genome. The major late terminator, Tφ, is located distal to
gene 10 and consists of a long stem-loop followed by a
string of six U residues. T7 Tφ is not totally efficient, and
read-through is actually required as there is no promoter
between the terminator and the essential genes 11 and 12.
Termination efficiency at Tφ is not affected by lysozyme
(329). In contrast, lysozyme is absolutely required at the
second terminator, CJ, named for its location near the
concatemer junction. On the normal genetic map of T7, CJ lies
immediately downstream of the left terminal repeat. CJ has
the sequence 5'-ATCTGTT, which has no obvious secondary
structure and is not followed by a string of U residues. In
the absence of lysozyme CJ acts primarily as a pause site
but does not cause termination; the presence of lysozyme
greatly enhances pausing, which results in substantial
levels of termination (158, 159, 329). Termination is thought
to be due to the paused elongation complex returning to an
unstable initiation complex (259), which is then destabilized
by lysozyme (104, 300). Pausing or termination of transcription
at CJ reflects the role of lysozyme in stimulating
DNA replication and packaging of DNA during infection
(329). The CJ heptamer sequence is conserved in the closest
relatives of T7 but not in phages more distantly related.
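
Because CJ is defined by a short sequence rather than by secondary structure, candidate sites in a related genome can be listed with a plain string search. The sketch below is a hypothetical helper, not a published tool: it scans only the strand it is given, reports 1-based positions, and assumes the heptamer is written on the strand supplied. A 7-base motif is expected to occur by chance a few times in a 40 kb sequence, so a match by itself is not evidence of a functional pause site.

```python
from typing import List

CJ_MOTIF = "ATCTGTT"  # heptamer quoted in the text


def find_motif(seq: str, motif: str = CJ_MOTIF) -> List[int]:
    """Return 1-based start positions of every occurrence of motif in seq.

    Overlapping occurrences are reported; only the given strand is scanned.
    """
    seq = seq.upper()
    motif = motif.upper()
    hits = []
    start = seq.find(motif)
    while start != -1:
        hits.append(start + 1)  # convert 0-based index to 1-based position
        start = seq.find(motif, start + 1)
    return hits


if __name__ == "__main__":
    toy = "GGATCTGTTAAATCTGTT"  # toy sequence for illustration, not T7 DNA
    print(find_motif(toy))
```

To assess conservation as described above, the same search would be run on the region just downstream of the left terminal repeat in each related genome.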

Late Gene Expression 

Class II genes (1.1-6.3) are expressed from about 6 minutes 
through 15 minutes of infection at 30°C. The first three 
genes, 1.1-1.3, in the class II region are thus transcribed 
early by E. coli and late by T7 RNAP. Ten promoters direct 
expression of T7 class II genes, but this organization is not 
conserved in all T 7-like phages. Most class II (and some 
class III) RNAs terminate at TcJ>, providing overlapping tran¬ 
scripts that have different 5' ends. Thus, genes lying at the 
distal end of the class II region are transcribed more 
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frequently than those near the beginning. Relative to pro¬ 
moters in theT7 class ILI region, class LI promoters are weak 
but this hierarchy is not maintained in all T 7-like phages. For 
example, the sequences of theT3 class II promoter <j>2.5 and 
the class III promoter (j)9 are identical (7). However, RNA 
concentrations may not be the major regulator of gene 
expression, and different translational efficiencies may 
account for most variations in the levels of class II proteins 
synthesized. For example, gp2.5 and gp3.5 are among the 
most actively synthesized proteins by T7, but gp2.8 and gp3, 
which are translated from the same set of overlapping RNAs, 
are not abundant proteins. Almost all T7 coding regions 
are preceded by “good” Shine-Dalgarno sequences; transla¬ 
tional coupling (225, 232) and the use of GCU as the second 
codon for highly expressed genes (62, 83) are two factors 
governing levels of proteins made during infection. 
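
The consequence of overlapping transcripts that share a terminator can be illustrated with a small counting sketch. The map positions below are hypothetical and in arbitrary units (they are not T7 coordinates), and the function name is chosen only for illustration; the point is simply that a gene lying downstream of more promoters is spanned by more transcripts.

```python
def transcript_coverage(promoters, genes, terminator):
    """Count, for each gene, the promoters whose transcripts span it.

    A transcript is assumed to start at its promoter and run to the
    shared terminator, so it covers every gene lying between the two.
    """
    coverage = {}
    for name, (start, end) in genes.items():
        coverage[name] = sum(
            1 for p in promoters if p <= start and end <= terminator
        )
    return coverage


# Hypothetical positions in arbitrary units, for illustration only.
promoters = [0, 10, 25, 40]
genes = {"proximal": (2, 8), "middle": (12, 20), "distal": (42, 50)}
terminator = 55

coverage = transcript_coverage(promoters, genes, terminator)
for gene, n in coverage.items():
    print(f"{gene}: covered by {n} transcript(s)")
```

The sketch ignores promoter strength and RNA stability, both of which also shape the RNA levels discussed above.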

Transcription of class II genes is controlled by the class II 
protein lysozyme (gp3.5) by a feedback mechanism. Gene 3.5 
mutants show extended transcription of class II genes, and 
reduced levels of both DNA replication and DNA packaging. 
Reduced replication could be due to high levels of
transcription from the φ1.1A and φ1.1B promoters, which
synthesize the initial primer RNAs, but this has not been
demonstrated experimentally. Reduced DNA packaging is 
due to the failure of the CJ pause/terminator signal to func¬ 
tion in the absence of lysozyme. Although transcription of 
class II genes is shut off by lysozyme, the question remains 
why class II protein synthesis also stops. Competition for 
ribosomes between class II mRNAs and the more abundant
class III mRNAs is again a possibility, but the functional
stability of class II mRNAs remains to be studied carefully.

Class III promoter sequences are totally conserved;
transcripts from φ6.5, φ9, and φ10 inefficiently terminate at
Tφ; read-through RNAs and transcripts from φ13, φ17, and
φOR terminate at the genome end or, if concatemeric DNA is
the substrate, perhaps at CJ. The gene 17 promoter cannot
be of major importance as T3 φ17 has T7 specificity and is
not used during infection (228). In contrast to many other 
phages, in T7 late transcription is not coupled to DNA repli¬ 
cation. In fact, incorporation studies of r-phosphate into 
T7 RNAs (8) suggest that transcription may be largely over 
before much replication has occurred. Furthermore, late 
gene expression appears normal when DNA replication is 
totally prevented by mutation (270). 

Class II Proteins 

Gp2 binds to and is an inhibitor of E. coli RNAP (97-99,200), 
but in K-12 or B strains the defects due to the lack of gene 2 
activity are reduced phage DNA replication and premature 
breakdown of replicating phage DNA, specifically at the 
left genome end where the major E. coli promoters are found 
(55, 192, 313, 314). Breakdown of replicating T7 requires 
both gene 3 endonuclease and DNA packaging activities. 


The gene 2 protein has been purified both by its inhibitory 
activity (54, 99) and by its ability to stimulate DNA
packaging in gene 2 mutant-infected cells (146). After early
transcription has been completed, a 2⁻ infection can be rescued
by addition of rifampicin during infection (192, 204), and the 
drug also substitutes for gp2 in DNA packaging assays (146). 
Growth on E. coli B or K-12 tsnB strains requires that 
T7 carries a gene 2 mutation(s) although T3+ growth is
normal (238). The tsnB mutation is now known to be
rpoC-E1188K (200); the change in RpoC prevents inhibition by
wild-type gp2 but not by the T7 gene 2 mutant (1 or T3 gp2.
The ability of T7 gene 2 null mutants to grow in E. coli C does 
not, however, appear to be due to differences in RNAP as 
transductants of E. coli C containing RNAP subunits from 
E. coli K-12 strains remain permissive (G. E. Christie and 
I. J. Molineux, unpublished observations). An E. coli K-12 
mutant that is deleted for the region of RpoC that binds gp2 
is nonpermissive for T7 (63; W. P. Robins and I. J. Molineux, 
unpublished observations). T7 mutants selected to grow on 
this host include some that affect the early E. coli promoters, 
but they have not been analyzed in detail. 

The defects due to the inability of gp2 to inhibit E. coli 
RNAP occur late in infection. Host transcription is normally 
inactivated by T7 gp0.7 before DNA replication and pack¬ 
aging occur, and neither gp2 nor rifampicin can remove 
promoter-bound E. coli RNAP. It is unclear how E. coli 
RNAP interferes with the process of DNA maturation and 
packaging. Nevertheless, both in vitro (308) and in vivo (55) 
the absence of gp2 leads to selective degradation of the left 
end of the T7 genome. It is obvious that there is much yet 
to understand about the biological roles of gp2 and E. coli 
RNAP during T7 infection.

Gene 2.5 encodes a single-stranded DNA-binding protein 
(SSB), which is essential for T7 DNA replication and recom¬ 
bination (128). The structure of the protein has been deter¬ 
mined to 0.19 nm (103); it exists as a dimer that, unlike E. 
coli SSB or T4 gp32, binds to DNA with little or no coopera- 
tivity (129, 130). However, T7 SSB is much more effective 
at mediating homologous base-pairing than its E. coli or 
T4 counterparts. SSB interacts with T7 primase-helicase, 
mediating efficient strand exchange and stimulating pri- 
mase activity (133, 134). SSB also interacts with T7 DNAP 
and is therefore the primary organizer for coordinating 
leading- and lagging-strand synthesis (130,148,149). 

Genes 3 and 6 code, respectively, for an endonuclease 
that prefers ssDNA and a 5'→3' exonuclease; both enzymes
are essential for recombination in vivo (150,151). Both pro¬ 
teins are also required for degradation of the host chromo¬ 
some during infection and their combined action provides 
the nucleotides for subsequent phage DNA replication. 
About 80% of the DNA in progeny phage is derived from the 
host chromosome. Both gp3 and gp6 can degrade T7 DNA 
in vitro; how their activities are controlled in vivo to 
prevent destruction of the infecting genome is unknown. 
The gene 3 endonuclease is also a Holliday structure 
resolvase (50, 51, 157) whose crystal structure is known (86).
The gene 6 exonuclease also possesses an RNase H activity 
that removes RNA primers (250, 251) and promotes concate- 
mer formation during DNA replication (147). 

Gene 3.5 codes for lysozyme, which is actually not a lyso¬ 
zyme but an N-acetylmuramyl-L-alanine amidase (112). The 
most important functions of lysozyme are also not in cell 
lysis but in controlling initiation and termination by T7 
RNAP, which in turn affect DNA replication and packag¬ 
ing (329). These properties better reflect its position among 
the class II DNA metabolism functions rather than with the 
class III lysis-associated genes 17.5 and 18.5/18.7. Lysozyme 
is, however, important in releasing progeny phage from the 
debris of lysed cells (252, 327). The nucleic acid metabolism 
and amidase functions of lysozyme can be separated. The 
N-terminal residues of the protein are essential for
interaction with T7 RNAP; removing residues 2-5 prevents binding
and causes the same defects in transcription, replication,
and packaging as a complete gene deletion (33, 327). The
deletion leaves amidase activity intact. In contrast, remov¬ 
ing residues 130-135 (of the 150 amino acids in the mature 
protein) inactivates amidase activity but allows a normal 
interaction with RNAP. Gene 3.5 amber mutants make 
small plaques at an efficiency of approximately 0.5 on non¬ 
suppressing hosts, whereas a 3.5 deletion mutant makes 
pinpoint plaques at an efficiency of approximately 0.2, 
although about half the normal amount of replication 
occurs and the burst size is about one third that of wild- 
type (327). Addition of T7 lysozyme (or detergent) to the 
lysed culture is necessary to complete release of progeny 
phage, which appear to be trapped. Although cell lysis by 
3.5 mutants is delayed and incomplete, it is not markedly 
different from that after infection by mutants blocking 
DNA replication (252, 269). The timing and normal abrupt¬ 
ness of lysis of T7-infected cells may be determined in part 
by DNA replication and/or packaging events. 

The full-length gene 4 protein is both a 5'→3' helicase of
the E. coli DnaB family and a DNA primase. Both enzyme 
activities have been the subject of recent reviews (69, 255). 
An in-frame internal start site within gene 4 results in 
a protein, gp4B, which lacks the N-terminal 63 amino acids 
of the 63 kDa gp4A and consequently lacks primase acti¬ 
vity (10, 11). Both proteins are made in infected cells; the 
native form of the enzyme appears to be a heterohexa- 
meric ring but each protein alone makes a similar structure 
(64, 203). The enzyme also forms heptameric rings (291). 
Single-stranded DNA passes through the central hole of 
the ring (322). Crystal structures of the separate primase 
and helicase domains, and of the intact protein, have been 
reported (116, 254, 291). Both the crystal structure of the 
intact protein and single-particle electron microscopic 
image reconstructions (299) show that the primase active 
site lies on the outside of the hexameric ring. Although 
most closely related members of the T7 family make two 
gene 4 proteins, the phage SP6 primase gene lacks an
internal start; furthermore, helicase activity of the protein
was not demonstrated (292).

The N-terminal region of gp4A contains a Cys4 zinc 
finger motif that makes sequence-specific interactions with 
a template for RNA primer synthesis (67). The primase recog¬ 
nition sites on single-stranded DNA are 5'-G/TGGTC, and 
5'-GTGTC; the 3' base is necessary for recognition but is 
not copied into primers, which thus have the sequence 
5'-ACCA/C and 5'-ACAC (68, 70, 178, 286). Helicase hydro¬ 
lyzes most NTPs, with dTTP being most efficient, and hydro¬ 
lysis is stimulated by single-stranded DNA (169, 170, 209, 
305). Hydrolysis of dTTP by the hexameric helicase has 
been likened (101) to the binding change mechanism of the 
rotary F1-F0 ATPase (202). The rate of translocation of
helicase along single-stranded DNA has been measured at 18°C
to be 132 bases per second (125), comparable to the rate of 
in vitro DNA replication. 
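
The relationship between the primase recognition sites and the primers they template can be checked with a short sketch. It assumes only standard Watson-Crick pairing and the rule stated above that the 3' base of the site is required for recognition but is not copied; the function and variable names are illustrative.

```python
# Pairing from a DNA template base to the RNA base it specifies.
DNA_TO_RNA = {"A": "U", "C": "G", "G": "C", "T": "A"}


def primer_from_site(site: str) -> str:
    """Return the RNA primer (5'->3') templated by a recognition site (5'->3').

    The 3'-terminal template base is dropped because it is needed for
    recognition but is not copied into the primer. The remaining bases are
    read 3'->5', which yields the primer in its 5'->3' orientation.
    """
    copied = site[:-1]
    return "".join(DNA_TO_RNA[base] for base in reversed(copied))


# 5'-G/TGGTC expands to GGGTC and TGGTC; the third site is 5'-GTGTC.
sites = ["GGGTC", "TGGTC", "GTGTC"]
primers = {site: primer_from_site(site) for site in sites}
for site, primer in primers.items():
    print(f"5'-{site} -> 5'-{primer}")
```

The three sites give ACCC, ACCA, and ACAC, in agreement with the primer sequences 5'-ACCA/C and 5'-ACAC quoted in the text.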

T7 DNAP consists of two proteins: the gene 5 protein 
and the processivity factor E. coli thioredoxin. Thioredoxin 
forms a 1:1 complex with the T7 protein and increases its 
processivity by at least 1000-fold. The complex also
increases the activity of the associated 3'→5' double-stranded
DNA exonuclease but has little effect on single-stranded 
DNA exonucleolytic activity (107, 285). The redox function 
of thioredoxin is not important in complex formation (106);
rather, the protein plays a role equivalent to that of
sliding clamps in other replication systems. A crystal struc¬ 
ture of T7 DNAP bound to a primer-template and with a 
dNTP in the active site has been obtained at 0.22 nm resolu¬ 
tion (58). Thioredoxin, and the loop on gp5 to which it binds, 
may encircle the primer-template DNA exiting the polymer¬ 
ase. T7 DNAP, like the RNAP, is a member of the DNA poly¬ 
merase I family with palm, fingers, and thumb domains. 
Thioredoxin binds to an extended 71 residue loop in the 
thumb domain; interestingly, this loop is not present in E. 
coli Pol I. Pol I synthesizes DNA distributively but engineer¬ 
ing the thioredoxin-binding sequences into the appropriate 
location confers thioredoxin-dependent processivity to the 
enzyme (9). 

Gene 5 protein, thioredoxin, gp4 primase-helicase, and 
gp2.5 SSB together catalyze simultaneous and coordinate 
replication from a synthetic minicircle double-stranded 
DNA template with two primase recognition sites and a 
single-stranded 5' tail (148,149). This cleverly designed reac¬ 
tion synthesizes DNA at approximately 300 bases per second 
per polymerase molecule at 30°C, close to the rate observed
in vivo. When sufficient SSB is provided, both leading and
lagging strands are synthesized at the same rate, and duplex
DNA molecules of more than 30 kb are produced.
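
To put the in vitro rate in perspective, the following sketch estimates how long a single fork moving at that rate would take to copy one genome. The genome length of roughly 40 kb is an assumption that is not stated in this section, and a single unidirectional fork is assumed purely to keep the arithmetic simple.

```python
FORK_RATE_BASES_PER_S = 300   # in vitro rate at 30°C quoted in the text
GENOME_LENGTH_BP = 40_000     # assumed approximate T7 genome length


def seconds_for_one_fork(genome_bp: int, rate_bases_per_s: float) -> float:
    """Time for one fork to traverse the genome at a constant rate."""
    return genome_bp / rate_bases_per_s


fork_time_s = seconds_for_one_fork(GENOME_LENGTH_BP, FORK_RATE_BASES_PER_S)
print(f"One fork copies the genome in about {fork_time_s / 60:.1f} minutes")
```

Under these assumptions one genome is copied in a little over two minutes; bidirectional replication or multiple forks would shorten that time.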

Replication In Vivo 

T7 DNA replication is independent of host DNA replica¬ 
tion, repair, and recombination functions. Although eroding 
one strand of each terminal repeat allows formation of
circular T7 molecules in vitro, such molecules are not
found in vivo. Nevertheless, T7 replication is extremely
sensitive to the topoisomerase inhibitor nalidixic acid (136).
By trapping covalently bound topoisomerase on DNA, nali¬ 
dixic acid inhibits T7 DNA replication. However, growth 
of T7 is normal in gyrA(Ts) mutants at their nonpermissive
temperature. Similarly, normal T7 growth occurs in
gyrA(Ts), gyrB(Ts), gyrA(Ts) gyrB(Ts) double mutants,
parC(Ts), and parD(Ts) mutants at high temperatures (P. Kemp
and I. J. Molineux, unpublished observations). Furthermore,
at the nonpermissive temperature for DNA gyrase or
topoisomerase IV, T7 growth is fully resistant to quinolone
antibiotics (P. Kemp and I. J. Molineux, unpublished
observations; 136).

The primary origin of replication is an AT-rich region 
lying immediately downstream of the φ1.1A and φ1.1B
promoters. Both are used by T7 RNAP to initiate replication 
on the leading strand. In vitro, primers of 10 to 60 bases 
originate from both promoters (76, 77), whereas in vivo the 
transition from RNA to DNA occurs at many sites over a few 
hundred base pairs (281). In the absence of T7 RNAP, no 
DNA replication occurs, but as RNAP is also necessary for 
late gene expression few, if any, replication proteins are 
made in its absence. Replication of DNA absolutely requires 
the presence of genes 1, 2.5, 5 (and E. coli trxA), and gene 4A.
Mutants lacking the helicase-only gene 4B grow normally 
(177, 230). In vitro, gp2.5 has been shown to be necessary 
for bidirectional replication (77). Mutants lacking genes 3 or 
6 show a premature replication shutoff; the host chromo¬ 
some is not degraded and the nucleotide precursors required 
for replication are therefore not produced. In optA1 cells,
which overexpress a dGTPase, the lack of gene 1.2, which
inhibits this host enzyme, also leads to a deficit of
nucleotide precursors and the consequent premature inhibition of
replication (5, 105, 311). T7 DNA is degraded by genes 3 and 6
in this abortive infection (233). 

T7 replication is bidirectional, but the overall movement 
of the left fork is delayed relative to the right fork. Thus the 
origin of replication determined by electron microscopy 
(where the midpoint of a replication bubble is measured) 
is displaced from the true origin. Deletion mutants that lack 
ori grow well, perhaps even better than wild-type under 
laboratory conditions. Thymidine-labeling of newly
synthesized DNA suggested that replication in these mutants
initiates from secondary origins associated with the φ6.5, φ13,
and φOR promoters (215). Replication from φOL was not
detected in this study. In contrast, density fractionation of
light T7 DNA that was replicating in heavy (¹⁵N, ²H) media,
followed by electron microscopic analysis, showed the most
pronounced secondary origin to be near the left genome end
(59, 288). This origin is possibly associated with φOL. The
difference in the two studies may reflect the different
growth conditions employed, which in turn suggests that
different replication origins may be used preferentially by
T7 as it grows in hosts that are in different physiological
states.

Plasmids containing φOR (and certain other promoters)
are replicated extensively in T7-infected cells (36, 37, 214)
and can inhibit phage growth (20, 36). Plasmids that contain
both φOR and the terminal repeat TR region are not only
replicated but also packaged to form transducing particles
(36, 37). Plasmids containing the comparable regions of
T3 respond similarly after T3 infection (293).

T7 DNA is linear: consequently, the first round of repli¬ 
cation produces two incomplete duplex molecules with 3' 
single-stranded extensions. These extensions are part of the 
terminal repeats and can thus anneal to form a linear conca- 
temer; the latter thus contains a single copy of TR between 
two genomes. Concatemer formation is known to require 
the gene 6 exonuclease in vivo (147) and may also require 
SSB to promote strand annealing. Linear concatemers 
containing up to about 10 genome equivalents are found 
early after infection but then a fast-sedimenting complex 
appears that contains >100 genome equivalents (205). This 
complex is not torsionally constrained, it contains many 
single-stranded regions, and it consists of genomes in the 
process of both recombination and replication (142, 143). 
T7 recombination is very efficient. Recombination inter¬ 
mediates are likely converted into replication forks, similar 
to what is found in T4 DNA replication. Consistent with this
idea, the bulk of progeny T7 DNA is synthesized within
about 5 minutes at 30°C (the more extended period seen
by pulse-labeling with thymidine reflects a reduction in 
nucleotide pool size, and a time-course of thymidine incor¬ 
poration is therefore not an accurate indicator of DNA 
synthesis rates throughout infection). The complex DNA 
structure is eventually resolved by the Holliday structure¬ 
resolving gp3 endonuclease back to linear concatemers 
(51,157), which are the substrates for DNA packaging. 

Class III Proteins 

Class III proteins are all involved in virion assembly, DNA 
packaging, or cell lysis. The latter process has not been 
studied in detail. In contrast to most phages, T7 lysozyme is
not required for lysis as measured by a decrease in culture
turbidity, although it is important for releasing progeny
from cell debris. Lysis is delayed after infection by a gene
3.5 null mutant but no more so than by any mutant defective
in DNA replication (269, 270, 327), and lysis may actually
be triggered by some aspect of DNA replication or
packaging. Gene 17.5 codes for a typical type II holin (302) but
17.5 amber mutants of both T7 and T3 still lyse, albeit
less synchronously and with a delay relative to wild-type. 
Whether the amber mutations employed were leaky or 
whether T7 has an alternative means of disrupting the 
cell membrane is not known. Note that T7 does not contain 
a typical lysis “cassette” of adjacent holin and lysozyme 
functions. In addition, genes 18.5 and 18.7 are separated
from holin by the small terminase subunit. Genes 18.5 and
18.7 are homologs of λ Rz and Rz1, and were shown to
complement P22 gene 15 mutants for lysis in the presence
of divalent ions (21). T7 18.5 or 18.7 mutants have not been
tested for a lysis defect. Interestingly, although close relatives
have the same genetic structure as T7, the P. aeruginosa
phage φKMV genome has been annotated with the more
traditional lysis cassette (144). 

Virion Assembly 

The head assembly pathways of T7, T3, φII, and probably
most close relatives of T7 are the same (246). Proheads
contain the head-tail connector gp8, scaffolding protein gp9,
capsid protein gp10, and the three internal core proteins
gp14, gp15, and gp16 (24, 26, 223, 241, 244). Early reports on
the functions of genes 7 and 13 in assembly should be lar¬ 
gely discounted. Gene 7 was originally defined by mutations 
that mapped between genes 6 and 8 (269) but was sub¬ 
sequently shown to be two genes, 7 and 7.3 (62). Both are 
now known to be mature virion proteins, but gp7 is nones¬ 
sential whereas gp7.3 is a tail protein that is essential for 
infectivity although not for particle formation (P. Kemp and 
I. J. Molineux, unpublished observations). 

In the absence of gp9 or gp10, no capsid-like structures
form (223, 244), whereas in the absence of either the
connector or the three internal core proteins aberrant poly¬ 
capsids appear (24, 223, 241, 242, 261). All six proteins are 
required for prohead formation in vivo but the order of 
their assembly remains unclear. The dodecameric connector 
and the three core proteins may assemble and serve as an 
initiator for scaffold and capsid protein assembly; alterna¬ 
tively, the scaffold and capsid protein may assemble to form 
incomplete prohead shells, which are then closed by inser¬ 
tion of a preformed connector-core complex (24). After DNA 
packaging the tail proteins cooperatively assemble on the 
filled head. No tail structures independent of a head are 
found (171,277). 

DNA Packaging 

T7 packaging has been examined in vitro and in vivo, 
although studies on T3 have been more comprehensive. 
A defined in vitro system, consisting of purified proheads 
and the terminase proteins gp18 and gp19, packages
unit-length DNA. ATP is used as the energy source, both for
DNA translocation and DNA condensation inside the pro¬ 
head (88). 

A crude lysate of T3-infected cells was shown to pack¬ 
age T3 DNA in vitro (72) and to discriminate between T7 
and T3 DNA substrates. Unit-length genomic DNA was 
active in this assay but was converted into concatemers
during the reaction (75). Discrimination is conferred
primarily by the specificity of the phage RNAP for the promoter
φOR, for downstream DNA, and by sequences at the
terminal repeat TR necessary for the initial cleavage of
concatemeric DNA (92, 93, 318). Comparable bipartite DNA
sequences required for packaging are found in T7 (37, 38). 
A more purified in vitro packaging system, consisting 
of proheads and the gp18 and gp19 terminase, packages any
linear double-stranded DNA without prior conversion to 
concatemers (88). 

The initial events of packaging T3 DNA are comparable 
to those of other double-stranded DNA phages (73, chapter 
6). DNA is recognized by the small terminase subunit gp18,
while the large subunit gp19 binds the prohead. There is
genetic evidence for a role of the connector protein in T3
DNA packaging (71, 74, 199) and the T7 connector has also
been shown to bind DNA (25). Using ATP as an allosteric 
effector, a 50S ternary complex forms between the T3 
prohead, DNA, and terminase; packaging is then initiated 
by ATP hydrolysis (247, 248). The large terminase subunit 
contains the ATP hydrolysis, the DNA translocation, and 
both the specific and nonspecific dsDNA endonucleolytic 
activities (193, 194, 195). Specific cleavage is not required 
for packaging unit-length DNA but is of course necessary 
to make a mature genome from concatemeric DNA. ATP 
hydrolysis is coupled to DNA translocation into the prohead. 
Each ATP molecule packages approximately 1.8 bp at an
average rate of 22 kb per minute at 30°C (193, 248). T7 DNA 
is packaged cooperatively from concatemeric DNA and at a 
comparable rate to that of T3 DNA in vitro (256, 283, 284). 
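
The packaging figures quoted above imply an ATP turnover rate and a packaging time that can be worked out directly. In the sketch below, the 1.8 bp per ATP and 22 kb per minute values come from the text; the genome length of about 38 kb is an assumed round number used only to complete the arithmetic, and the function names are illustrative.

```python
BP_PER_ATP = 1.8             # bp packaged per ATP hydrolyzed (from the text)
RATE_BP_PER_MIN = 22_000     # average packaging rate at 30°C (from the text)
GENOME_BP = 38_000           # assumed approximate genome length


def atp_per_second(rate_bp_per_min: float, bp_per_atp: float) -> float:
    """ATP molecules hydrolyzed per second at the given packaging rate."""
    return rate_bp_per_min / 60 / bp_per_atp


def minutes_to_package(genome_bp: float, rate_bp_per_min: float) -> float:
    """Minutes needed to translocate one genome into the prohead."""
    return genome_bp / rate_bp_per_min


def atp_per_genome(genome_bp: float, bp_per_atp: float) -> float:
    """Total ATP consumed in packaging one genome."""
    return genome_bp / bp_per_atp


atp_rate = atp_per_second(RATE_BP_PER_MIN, BP_PER_ATP)
packaging_min = minutes_to_package(GENOME_BP, RATE_BP_PER_MIN)
atp_total = atp_per_genome(GENOME_BP, BP_PER_ATP)
print(f"{atp_rate:.0f} ATP/s, {packaging_min:.1f} min, {atp_total:.0f} ATP per genome")
```

With these inputs the motor hydrolyzes roughly 200 ATP per second and fills a head in under two minutes.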

“Solutions” to the problem of completely replicating 
the linear T7 DNA molecule have existed for over 30 years 
(117, 307). These models have long been known to be incor¬ 
rect in detail. How the single copy terminal repeat (TR) 
between genomes in the concatemer junction is duplicated 
in producing mature progeny genomes is still not fully 
understood. The failure to duplicate the TR necessarily 
results in only half the replicated DNA being packaged into 
viable progeny particles. Although this would seem to be 
a very inefficient use of genetic material, it should be 
noted that normal lysates do contain substantial amounts 
of unpackaged phage DNA. Nevertheless, it is generally 
supposed that T7 does have a mechanism to duplicate the 
TR during maturation and packaging. In vitro studies 
suggest that the two mature ends of T7 DNA are formed 
in uncoupled reactions that involve double-stranded DNA 
cleavages (308). The initial double-stranded DNA break is 
made at the right end of the concatemer junction
(corresponding to the genome right end) by the gp18-gp19
terminase. In vivo the mature right end is known to be formed
prior to the mature left end (245), but how the genome 
left end is formed is less clear. A covalently closed hairpin 
containing the TR is found during T7 infection (38), and 
is also found in cells replicating plasmids that contain 
the concatemer junction (36, 37). Sequencing the hairpin
structure DNA suggested the model shown in figure 20-3B
for duplication of the terminal repeat (38). The nuclease that
nicks the cruciform to generate the hairpin is known not
to be the gene 3 product, but has not yet been identified.
It was suggested that it may be the gene 19.5 product, but
the latter is not essential for the formation of transducing
particles (293), and neither 19.5 nor the cruciform region
is essential for phage growth (126, 127). The most
pronounced defect due to loss of 19.5 is a delayed breakdown
of host DNA and a reduced burst. Deleting the cruciform
DNA causes a lysis delay.

Figure 20-3 Proposed mechanisms for TR duplication. Adapted from Fujisawa and Morita (73) and Chung and Hinkle (37).

As originally described, the hairpin model does not 
account for the role of transcription termination at the CJ 
pause/terminator signal found immediately downstream 
of TR. The failure of a T7 RNAP mutant that is transcrip¬ 
tionally competent but pause/termination-defective at CJ 
to support DNA packaging can be suppressed by a gene 19 
mutation (158, 328). This observation led to the idea that 
the paused or terminating RNAP at CJ allows the nontem¬ 
plate strand to be nicked by terminase. Strand displace¬ 
ment synthesis then allows duplication of the TR sequences 
(73, figure 20-3A). However, this model does not explain
the existence of hairpin DNA, which is thought to be an 
intermediate in formation of the mature left genome end. 
Nor does it satisfy the biochemical observation that 
terminase causes a double-strand break. The difficulty in
establishing the mechanism of TR duplication suggests
either that more than one efficient pathway exists or perhaps 
that TR duplication is not necessary for a substantial burst of 
progeny phage. 
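
The statement that failure to duplicate the TR would leave only half the replicated DNA packageable follows from simple counting, sketched below as a toy model. It assumes a linear concatemer with one TR copy at each junction and at each end, and that every mature genome needs a TR at both ends; the function names are illustrative and no mechanism is implied.

```python
def tr_copies(n_genomes: int) -> int:
    """TR copies in a linear concatemer: one per junction plus the two ends."""
    return n_genomes + 1


def max_packageable(n_genomes: int, duplicate_tr: bool) -> int:
    """Upper bound on mature genomes obtainable from one concatemer.

    Without duplication a junction TR can go to only one of its two
    neighbors, so at best every other genome receives both of its TRs.
    """
    if duplicate_tr:
        return n_genomes
    return (n_genomes + 1) // 2


for n in (2, 10, 100):
    print(n, tr_copies(n), max_packageable(n, False), max_packageable(n, True))
```

For a concatemer of 10 genomes the model gives 5 packageable genomes without TR duplication and 10 with it.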

Exclusion of T7 by the F Plasmid 

T7 growth is aborted in cells containing the F plasmid 
(164). The only F function required is pifA (46, 47, 235, 236), 
a gene that is nonessential for conjugation or plasmid main¬ 
tenance. A similar gene is found in a few other conjugative 
plasmids, including R56 and R64 (108,180, 306). Proteins 
closely related to PifA are also predicted in a pathogeni¬ 
city island of a uropathogenic E. coli strain and in a cryptic 
prophage of S. typhimurium. More distant relatives are 
found in both Gram-negative and Gram-positive organisms, 
and patches of the PifA amino acid sequence show simi¬ 
larity to a number of ABC transporters and other ATP- 
binding proteins. Attempts to demonstrate ATP binding to 
F PifA have not been successful. Other than inhibi¬ 
ting growth of T7 and related phages, PifA has no defined 
function. 

The pif operon contains only pifA and an autoregulated 
repressor pifC (46, 47, 181,182). Early reports of a pifB gene 
are incorrect. PifA is a membrane-associated protein but 
has not been shown to cross the cytoplasmic membrane (32).
PifA interferes with the normal function of both gene 1.2
and gene 10, and the interaction of PifA and either phage
protein results in the inhibition of all macromolecular
synthesis in infected cells. Inhibition requires the
accumulation of T7 gp1.2 or gp10 but the slow penetration of T7
DNA into the infected cell means that the infection is aborted 
before the complete genomes of about half the infecting 
phages have entered the cell (80). Early T7 gene expression 
is therefore normal in cells containing F, but expression of 
genes closer to the right end of the genome is severely 
affected (8,12,13, 40,196,197). Exclusion of T7 also results 
in the loss of membrane integrity. Membrane transport 
processes are inhibited, and intracellular ATP and other 
phosphorylated molecules leak from the abortively infected 
cell (41, 234, 235, 236, 237). A major problem in understand¬ 
ing the process of exclusion is in distinguishing the initial 
event that then leads to a plethora of physiological dysfunc¬ 
tions (188). 

Mutants of T7 that grow in cells containing F have been 
isolated (190). They necessarily contain a missense or null 
mutation in gene 1.2 and two missense mutations in gene 
10. Although T3+ grows normally in F cells, T3 1.2 mutants
behave like wild-type T7 and are excluded (191). In the
1.2 mutant background two missense mutations in T3 gene
10 are required to restore growth (42). T3 gene 1.2 therefore
normally protects the PifA-containing cell from expression
of gene 10. T7 gene 1.2 is not only unable to perform this
function but itself causes the infection to abort. 

The physiological defects associated with F exclusion of 
T7 can be mimicked by using cloned genes in the absence 
of phage infection. Any plasmid expressing pifA is lethal to 
a cell that also expresses a cloned gene 1.2 or gene 10. Leth¬ 
ality is completely suppressed when the cloned gene 1.2 
contains a missense mutation first recognized in mutant 
phages that grow in F cells, but is only reduced with com¬ 
parable mutant gene 10 plasmids (237). Several indepen¬ 
dent selections for E. coli mutants that tolerate the 
combination of pifA and gene 1.2 or pifA and gene 10 resulted 
in the isolation of a single mutant (303). The mutation is 
a promoter-up that increases expression of fxsA about 
25-fold. A disruption of fxsA still leads to F exclusion of 
T7, thus FxsA is not the cellular target of PifA and either 
of the T7 proteins. Overexpression of fxsA therefore pro¬ 
vides a protective function that alleviates most of the 
physiological defects associated with F exclusion of T7 and 
allows a normal phage burst (304). fxsA overexpression 
seems to affect the mechanism of exclusion directly, which 
is different from most other chromosomal mutations (e.g., 
strA, rif, rho, gyrA, galU, himA) that affect the exclusion
phenotype by affecting expression levels of pifA or overall 
cell physiology (121, 236). 

FxsA is a membrane protein of unknown function with 
four transmembrane segments (32, 304) and a C-terminal 
cytoplasmic tail. This tail is not important for overcoming
exclusion, but the fourth transmembrane segment is
essential (32). The membrane association of PifA and the
protection from lethality by the integral membrane protein FxsA
strongly suggest that the cell membrane is the site of the 
initial events in the exclusion process. 

PifA interacts with gp1.2 and with gp10 in vitro, and all
three interact with FxsA, but the significance of these
interactions, which are likely also to occur in vivo (32), remains
unclear. By affinity methods, no soluble E. coli protein was
found to interact with PifA or T7 gp10, but T7 gp1.2 bound
E. coli dGTPase and PNPase. The former interaction reflects
only the inhibition of dGTPase activity by gp1.2 (5, 105, 198)
and not F exclusion (188). The significance of the interaction
between gp1.2 and PNPase is unclear. Neither
polymerization nor RNA degradation by PNPase is affected by gp1.2,
and levels of PifA, T7 gp1.2, and gp10 are unaltered by a pnp
null mutation. However, plating of T7 on E. coli pnp mutants 
containing F is increased by several orders of magnitude, 
relative to plating on F-plasmid-containing E. coli wild- 
type (32). 
Bacteriophage N4 is a lytic phage specific for Escherichia
coli K-12 strains originally isolated from the sewers of 
Genoa (80). N4 is unique in: (i) the use of three different 
DNA-dependent RNA polymerases during its growth cycle, 
(ii) a virion-encapsidated RNA polymerase (N4 vRNAP) 
that is injected into the host cell upon infection, (iii) the use 
of single-stranded DNA binding proteins as transcriptional 
activators, (iv) the presence of 3' extensions at each end of 
its linear genome, and (v) a lysis-inhibited infection cycle. 

The Virion 

The N4 virion particle, as visualized by electron micros¬ 
copy, consists of an icosahedral head 70 nm in diameter 
connected by a base plate to a small noncontractile tail, and 
a number of short tail fibers originating from the junction 
between the head and the tail (figure 21-1A) (82). The virion 
particle is composed of a single linear double-stranded DNA 
molecule and, as determined by sodium dodecyl sulfate- 
polyacrylamide gel electrophoresis, 10 proteins (25). The 
major component of the virion is the coat protein, a 48 kDa 
polypeptide. Also present is a virion-encapsidated phage- 
coded, DNA-dependent RNA polymerase (vRNAP) that is 
responsible for transcription of phage early genes (24). 

The Genome 

The genome, which has been recently sequenced, is a linear,
double-stranded DNA molecule 70.6 kbp in length with a G-C
content of 41.3 mol% (R. Hendrix, personal communication).
Direct repeats varying in length from 390 to 440 bp
and 3' extensions are present at the ends of the genome
(58). The left end is unique, with microheterogeneity at the
3'-terminus yielding primarily the 5- or 6-base 3'-protruding
sequences 3'-CATAA or 3'-CATAAA (figure 21-2A). In contrast,
one major and at least three minor discrete families of
the right end exist, differing in length by 10 bp and giving
rise to the variability in the length of the terminal repeats. 


Each of these families of ends has a microheterogeneity 
of length with 1- to 3-base 3' extensions (figure 21-2B) (58). 
The N4 genome is resistant to cleavage by a wide range of
restriction endonucleases (58, 90).

The genome encodes three tRNAs and 72 open reading
frames (ORFs), 51 of which show no statistically
significant similarity to protein sequences in the database.
Five early genes are located in two clusters at the left end of
the genome and encode, among others, a protein implicated
in genome injection (ORF1; A. Demidenko, unpublished
data), and proteins that comprise the N4 middle transcriptional
machinery (ORF2, ORF15, and ORF16; see below).
Genes transcribed by the middle transcriptional machinery
make up the remainder of the left half of the genome.
Eleven middle ORFs encoding small proteins (ORFs 4-14)
are located between the early gene clusters. Surprisingly, a
32 kDa virion protein (ORF17) of unknown function is
transcribed during the middle period. ORF18 is homologous,
on both the sequence and functional level, to bacteriophage
P22 gp17, which enables P22 to successfully infect
Salmonella strains containing the Fels-2 prophage (65;
N. Federova, unpublished data). Five middle ORFs (ORFs
19-21, 46, and 48) are similar to ORFs encoded by the
Salmonella typhi HMC 2 plasmid (59). ORFs 24 and 25 are
similar to adjacent ORFs found in the Streptomyces coelicolor
A3, Thermotoga maritima, and Deinococcus radiodurans
genomes. ORFs 33 and 34 encode proteins with strong
sequence similarity to putative domains of T4 RIIA and
RIIB. N4 is currently the only phage unrelated to the T4
phage family known to encode RII-like proteins. Middle
gene products also include four proteins required for N4
DNA replication (discussed later) and ORFs
with sequence similarity to dCTP deaminase/dUTPase
(ORF26) and Thy1, a thymidylate synthase complementing
protein (ORF30). Two middle genes of unknown function
are located at the right end of the genome (M. Hammer,
unpublished data). Late ORFs are located in the right half
of the genome. These include genes for the vRNAP and
other virion proteins, as well as a putative "lysis cassette."

Figure 21-1 Phage N4 virions and infected bacteria.
A: Electron micrograph of bacteriophage N4 virions stained
with 2% phosphotungstic acid. Enlargement: ×175,000.
Courtesy of M. Ohtsuki, University of Chicago. B: Phase-contrast
microscopy of uninfected cells (left) and cells
2 hours after N4 infection (right). Courtesy of Dr. G. C.
Schito, Institute of Microbiology, University of Genoa.
C: Electron micrograph of an ultrathin section of E. coli K12S
4 hours after N4 infection. Enlargement: ×44,000. Courtesy
of Dr. G. C. Schito, Institute of Microbiology, University of
Genoa.

N4 Adsorption 

N4 infection is initiated by phage attachment to at most
five sites per bacterium. E. coli mutants to which N4 cannot
adsorb were isolated. The mutations map to four loci named
nfrA, nfrB, nfrC, and nfrD (43). The nfrA and nfrB genes are
located at 12.7 min on the E. coli linkage map. The nfrB
gene lies upstream of the nfrA gene and the two overlap
by 14 nucleotides. The nfrA gene encodes a 122 kDa outer 
membrane protein, while the nfrB gene encodes an 85 kDa 
inner membrane protein with three potential membrane- 
spanning regions (43, 44). Two ORFs with sequence similar¬ 
ity and similar organization to the nfrA and nfrB genes are 
also found in the Ralstonia solanacearum genome, though 
their functions are unknown (GenBank CAD 15982 and 
17659, respectively). A plasmid expressing the NfrA and 
NfrB proteins confers sensitivity to N4 infection to E. coli 
B, which is normally resistant. The nfrC gene, located at 
85.5 min within a cluster of genes involved in the synthesis
of enterobacterial common antigen (ECA), is identical
to rffE/wecB, and encodes a 42 kDa cytoplasmic protein,
UDP-GlcNAc 2-epimerase (42, 74). Mutations in nfrC affect
neither the expression level nor the export of the NfrA protein,
nor the expression or localization of NfrB in maxicells. The
identity of nfrD, which maps to approximately 54.2 min, 
is unknown. It has been reported that pel mutants, which
lack the II-PMan or II-MMan components of the mannose
permease system required for λ phage DNA injection
(20, 21), do not support N4 growth. N4 may interact with
these proteins; however, N4 adsorption was not measured
in these mutant strains (45). We have proposed that NfrA
is the N4 receptor and that NfrB might provide a channel
for injection of the genome and vRNAP (43). Since nfrC
mutations do not affect the localization or amount of NfrA, 
we suspect that they might affect access to NfrA as a result 
of changes in ECA structure. 

N4 Growth Cycle 

Following phage adsorption, the genome and N4 vRNAP
are injected into the host cell. Little is known about the
mechanism of injection of vRNAP; the possible role of
the N- or C-terminal domains of the vRNAP polypeptide in
this process is under study (39). The process of genome
injection is complex. The injection of the first 1000 bp of
the genome, which occurs in the absence of post-infection
protein synthesis, requires the N-terminal domain of the
vRNAP polypeptide (A. Demidenko, unpublished data).
Genome injection beyond the first 1000 bp requires the
product of ORF1 and vRNAP transcription from the three
early promoters (see below). Host replication stops 3 minutes
after infection; an as yet unidentified early gene product
must be responsible, since host replication is not inhibited
upon infection in the presence of chloramphenicol (72).
The host genome is not degraded, and host messenger RNA
and protein syntheses remain unaffected except for transcription
of cAMP-dependent operons, which are shut off (72).

Figure 21-2 Structure of the ends of the N4 genome. A: Left end. B: Four families of sequences found at the right end;
C predominates. Arrows mark the ends of the 3' extensions.

At least three products of early transcription are 
required for the synthesis of N4 middle transcripts, which 
begin to appear 3 minutes after infection. This transcriptional
machinery is responsible for injection of the last
20 kbp of the N4 genome (A. Demidenko, unpublished
data). Middle genes encode, in part, proteins required for N4
DNA replication. N4 DNA synthesis begins approximately 
10-12 minutes after infection (73). 

Late RNA synthesis, which begins 15 minutes after
infection, is carried out by the E. coli σ70-RNA polymerase
(92) activated by the N4 single-stranded DNA binding
protein (N4 SSB) (8). The first mature progeny are observed
approximately 30 minutes post-infection and become localized
to the cell poles (figure 21-1B) (79). Because N4 does
not actively lyse the host cell, infected cells continue to 
grow, becoming enlarged and filled with a paracrystalline 
array of phage particles (figure 21-1C) (78). A yield of up to 
3000 N4 particles per infected bacterium is obtained over 
a 3 hour period (78,80). 


N4 Transcription 

The N4 genome is transcribed in three temporal stages
(early, middle, and late) by three different DNA-dependent
RNA polymerases (figure 21-3) (92). Both early and middle
RNAs are transcribed with rightward polarity through the 
left half of the genome, while late transcription occurs 
through the right half of the genome with leftward polarity. 
Details of each of the three transcriptional stages follow. 

Early Transcription 

All known bacterial DNA viruses, except N4, use the host 
RNA polymerase to transcribe their early genes (67). N4 
virions contain one or two copies of a phage-encoded RNA 
polymerase (vRNAP), which is injected with the phage 
genome upon infection and transcribes phage early genes 
during the first 5 minutes of infection (24). vRNAP activity 
in gently lysed cells is found to sediment with the DNA- 
membrane complex in sucrose gradients, making purifica¬ 
tion from infected cells difficult (22, 23). Therefore, vRNAP 
has been purified to homogeneity from virions and charac¬ 
terized (25). 

Contrary to a previous report (61), the rifampicin- and
streptolydigin-resistant vRNAP is a single polypeptide of
estimated 320,000 molecular weight (25). Sequencing of the
vRNAP gene revealed an ORF for a 3500 amino acid (aa)
polypeptide that lacks extensive sequence similarity to
either of the two known families of DNA-dependent RNA
polymerases (39). However, vRNAP contains four short
motifs (TxxGR, A, B, and C) characteristic of the family of
T7-like single-subunit RNA polymerases,
which includes phage-encoded, mitochondrial and some
chloroplast nuclear-encoded and linear plasmid enzymes
(figure 21-4) (5, 39). To determine whether a smaller
transcriptionally active domain exists within the polypeptide,
vRNAP purified from virions was subjected to controlled
trypsin proteolysis followed by a catalytic autolabeling
assay (29, 33). Indeed, a stable and transcriptionally active
1106 aa long domain (mini-vRNAP), which possesses the
same initiation, elongation, termination, and product
displacement properties as full-length vRNAP, is located at the
center of the vRNAP polypeptide (figure 21-4) (39).
Mutational, biochemical, and phylogenetic analyses indicate
that N4 mini-vRNAP is a highly evolutionarily divergent
member of the single-subunit RNAP family (39).

Figure 21-3 Transcriptional program of the N4 genome.
White blocks, early genes; gray blocks, middle genes;
black block, late genes. The enzyme responsible for each
transcriptional stage is indicated.

In vitro, vRNAP shows peculiar template specificity; it is 
inactive on linear double-stranded templates but transcribes 
denatured genomic N4 DNA or single-stranded promoter- 
containing DNA with in vivo specificity (26, 34). vRNAP 
recognizes three promoters, Pe1, Pe2, and Pe3, located at
the left end of the genome (34). Promoter Pe1 is located
within the terminal repeat and, therefore, a second copy
exists at the right end of the genome (58). The three early
promoters span positions -17 to +1 relative to the transcription
start site and share blocks of conserved sequences and
a set of inverted repeats centered at -11 (figure 21-5A) (34).




Figure 21-4 vRNAP and mini-vRNAP. Location of the transcriptionally active domain (mini-vRNAP) and the TxxGR, A, B, and C
motifs in the vRNAP polypeptide. vRNAP stands for virion-encapsidated RNA polymerase.
[Diagram labels: vRNAP, aa 1 to 3,500; mini-vRNAP, aa 998 to 2,103.]

Figure 21-5 Phage N4 vRNAP promotion. A: Consensus sequence of the N4 virion-encapsidated RNA polymerase (vRNAP)
promoters. Nonconserved positions in the inverted repeats are indicated by X:X'. The site of transcription initiation is
indicated by +1. Boxed sequences are required for high hairpin stability and extrusion. B: Proposed model for the pathway
of vRNAP-promoter utilization. The stoichiometry of Eco SSB in the activated complex is unknown.


All information necessary for promoter recognition and
specific initiation is contained in the promoter template
strand (27). Base changes in the nonconserved positions of
the inverted repeats (X:X') that disrupt potential base-pairing
between the repeats lead to a severe decrease in
promoter activity, suggesting that a hairpin structure is
required for vRNAP-promoter recognition (27). Chemical
and enzymatic footprinting confirmed that the inverted
repeats, when present on single-stranded DNA templates,
base-pair to form a hairpin structure with a 5 bp stem and
3 base loop (27). Run-off transcription experiments using
templates containing mutant promoter sequences defined
specific positions required for transcription (27). Determination
of binding affinity (Kd) as well as UV crosslinking
(312 nm) to deoxyoligonucleotide templates containing
5-iodo-deoxyuracil at positions within the promoter revealed
that a purine at position -11 presented in the context of a
hairpin loop is essential for promoter binding (E. Davydova,
unpublished data). Regions of the transcriptionally active
vRNAP domain that contact position -11 and the hairpin
stem were determined through crosslinking experiments
using mini-vRNAP. Position -11 is contacted by amino
acids present in the aa 83-200 interval (E. Davydova and
K. M. Kazmierczak, unpublished data), while the hairpin
stem is contacted by amino acids present in the aa 812-918
interval (figure 21-6) (E. Davydova, unpublished data).
In contrast, promoter recognition in the distantly related T7
RNAP is conferred primarily by the specificity loop, located
at aa 739-769 (figure 21-6) (6, 68, 71). Also unlike those of T7
RNAP, mini-vRNAP-promoter complexes are stable and
resistant to dissociation by 2 M NaCl (17).
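
The hairpin geometry described above (a 5 bp stem closing a 3 base loop on a single strand) can be illustrated with a short search routine. This is only a sketch: the function names and the test sequence are hypothetical, the sequence is invented and is not an N4 promoter, and only perfect Watson-Crick stems are considered.

```python
# Watson-Crick partners; any other symbol maps to "?" so it never pairs.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return "".join(COMPLEMENT.get(base, "?") for base in reversed(seq))


def find_hairpins(seq, stem=5, loop=3):
    """List (start, 5' arm, loop, 3' arm) for every perfect stem-loop.

    A hit requires the 3' arm to be the exact reverse complement of the
    5' arm, with `loop` unpaired bases between them.
    """
    seq = seq.upper()
    span = 2 * stem + loop
    hits = []
    for start in range(len(seq) - span + 1):
        arm5 = seq[start:start + stem]
        loop_seq = seq[start + stem:start + stem + loop]
        arm3 = seq[start + stem + loop:start + span]
        if arm3 == reverse_complement(arm5):
            hits.append((start, arm5, loop_seq, arm3))
    return hits


# Invented sequence: TT + GCATC + GAA + GATGC + TT
EXAMPLE = "TTGCATCGAAGATGCTT"
print(find_hairpins(EXAMPLE))  # [(2, 'GCATC', 'GAA', 'GATGC')]
```

A real analysis would also score stem stability and loop sequence, which the text indicates matter for vRNAP recognition; the sketch checks geometry only.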

In vivo, early transcription is sensitive to E. coli DNA 
gyrase inhibitors, suggesting that the phage genome must 
undergo a structural change within the host cell in order 
to become a suitable template for vRNAP (26). However, no
transcription was observed in vitro on promoter-containing
plasmids of superhelical densities less than -0.071, twice
the level of supercoiling found in the cell (16). A requirement
for a second host factor, Eco SSB, was discovered upon N4
infection of a host strain carrying the conditional ssb-1
mutation; no N4 early transcription was observed in ssb-1
cells grown at the nonpermissive temperature (51). In vitro,
Eco SSB stimulated specific transcription from promoters 
located on plasmids of physiological superhelical density 
(16, 51). 
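
The supercoiling threshold quoted above can be placed on the usual scale of superhelical density. The following worked sketch is not taken from the source: it assumes the standard definition of superhelical density from linking number, a helical repeat of 10.5 bp per turn, and a hypothetical 3,000 bp plasmid chosen only to make the arithmetic concrete.

```latex
% Standard definition (assumed): sigma from linking number Lk and its relaxed value Lk_0
\sigma = \frac{Lk - Lk_0}{Lk_0}, \qquad Lk_0 \approx \frac{N}{10.5\ \text{bp/turn}}

% Hypothetical plasmid, N = 3000 bp
Lk_0 \approx \frac{3000}{10.5} \approx 286

% Linking difference at the in vitro threshold quoted in the text
\Delta Lk = \sigma \, Lk_0 \approx -0.071 \times 286 \approx -20

% "Twice the level found in the cell" implies a cellular value near
\sigma_{\text{cell}} \approx \frac{-0.071}{2} \approx -0.036
```

On these assumptions, the threshold corresponds to roughly 20 fewer helical turns than the relaxed plasmid, about double the deficit expected at the cellular level.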

vRNAP transcripts generated from single-stranded
templates are retained as RNA-DNA hybrids, indicating
that vRNAP cannot displace the RNA product (26). Transcript
release, as assayed by sensitivity to S1 or RNase H
digestion, is observed in the presence of Eco SSB during
synthesis from a supercoiled double-stranded plasmid (51)
or from promoter-containing oligonucleotide templates (18).
Therefore, Eco SSB stimulates transcription through template
recycling.

Figure 21-6 Sequence comparison of mini-vRNAP, T7 RNAP and N4 RNA polymerase II. Locations and sizes of conserved
sequence blocks are schematized. Numbers indicate the known boundaries of structural domains in T7 RNA polymerase (36)
and inferred boundaries in N4 mini-vRNA polymerase and N4 RNA polymerase II. Numbering in N4 RNA polymerase II reflects
the positions of these blocks in the p7/p4 fusion polymerase sequence. Residue 270 corresponds to amino acid 1 of p4. Color
coding is as follows: N-terminal domain, diagonally hatched; thumb, perpendicularly hatched; palm, black; palm insertion
domain, white; fingers, gray; and foot module, pixels. The locations of polymerase motifs and catalytically important
residues within them are indicated.

Eco SSB activation of vRNAP transcription is surprising
because binding of SSBs to single-stranded DNA
normally leads to melting of DNA secondary structures.
However, enzymatic and chemical footprinting of Eco SSB
on promoter-containing single-stranded DNA templates
indicates that it is unable to melt the promoter hairpin
upon binding, in contrast to other SSBs that do not activate
transcription (fd gpV, T4 gp32, T7 gp2.5, N4 SSB) (28).
The 177 aa Eco SSB polypeptide can be subdivided into a 
120 aa N-terminal domain responsible for DNA binding 
(84), and a C-terminus consisting of a proline- and glycine- 
rich region followed by an acidic 10 aa tail highly conserved 
in eubacterial SSBs (49). Human mitochondrial SSB (Hsmt 
SSB) possesses sequence and structural similarity to the 
N-terminal domain of Eco SSB but lacks the conserved 
C-terminal domain (13, 83, 87). Hsmt SSB did not activate 
vRNAP transcription although it did not disrupt the promo¬ 
ter hairpin (18). Through the use of C-terminally truncated 
Eco SSB proteins and Hsmt SSB-Eco SSB C-terminal domain 
chimeras (12), we have shown that the last 10 aa of Eco SSB 
are essential for activation (18). We have been unable to 
detect vRNAP-Eco SSB interactions using purified proteins; 
however, it is possible that interactions only occur in the 
context of promoter-vRNAP or elongation complexes. 


To explain the generation of a functional early pro¬ 
moter in vivo, we proposed that introduction of negative 
supercoiling by DNA gyrase induces the extrusion of a cruci¬ 
form structure at the promoter that is invaded by Eco SSB 
to yield an “activated promoter” (figure 21-5B) (16). To test 
this model, plasmid minicircles comprising N4 promoters
Pe1 and Pe2 and their downstream ORFs were generated
in vivo, isolated, and minicircle topoisomers of known 
superhelical densities prepared (54). The single-stranded 
DNA-specific probes chloroacetaldehyde and mung bean
nuclease, and T7 endonuclease, which recognizes DNA
four-way junctions, were used to detect supercoiling-induced
structural changes at the promoters. Supercoiling-
and Mg2+-dependent cruciform extrusion was observed
at physiological superhelical densities; surprisingly, only
the nontemplate-strand hairpin loop is sensitive to single-stranded
DNA-specific probes (15). A mutant promoter
(P2flip), in which the bases within the loops of the template 
and nontemplate strand hairpins were switched, showed a 
corresponding switch in sensitivity to single-stranded DNA- 
specific probes. These results suggested that the loops of the 
two hairpins differ in conformation, a hypothesis supported 
by nuclear magnetic resonance data (10, 88; M. Kloster, 
unpublished data). 

Mutational and extrusion analyses identified sequences
essential for promoter cruciform extrusion. Extrusion
requires a minimum stem length of 4 bp, provided that
the stem is composed exclusively of G:C base pairs and that
the loop and loop-closing base pair have the sequence
5'-C-GDA-G-3' (where D = G, A, or T) (figure 21-5A) (15). This
sequence has been shown to yield unusually stable hairpins
(14, 36). These results, in addition to studies of promoter
activity on single-stranded templates, reveal that the
conserved sequences within the vRNAP promoters are
important both for formation of the hairpin structure and
for specific vRNAP-promoter contacts (15, 16, 27).

Analysis of the sequences downstream of ORF1 and
ORF2, transcribed from Pe1 and Pe2 respectively, revealed
eight- or nine-base palindromes followed by four or five
thymidines. In vitro transcription of single-stranded or
supercoiled, SSB-activated templates containing this
region of the N4 genome led to the synthesis of defined
transcripts initiating at Pe1 and Pe2 and terminating at t1
and t2, respectively (15, 27). Therefore, vRNAP recognizes
eubacterial factor-independent-like termination signals.

Middle Transcription 

Transcription of coliphage N4 middle messenger RNAs
requires the activities of three early proteins: p17, p7,
and p4 (73, 89, 92). Two of these proteins, p7 (30 kDa) and
p4 (40 kDa), have been purified to homogeneity and
constitute a heterodimeric, rifampicin-resistant RNA polymerase,
N4 RNA polymerase II (RNAPII) (90). However, this
heterodimer does not bind to double-stranded, promoter- 
containing templates and transcribes promoter-containing 
single-stranded DNAs with low efficiency and no specificity 
(1, 90). RNAPII subunits p7 and p4 are tightly associated 
and copurify both from phage-infected cells (90) and, when 
overproduced, from plasmid-cloned genes (4). The p7 sub¬ 
unit was cloned with an N-terminal hexahistidine-tag; 
upon overproduction of the proteins, both hexahistidine- 
tagged p7 and native p4 were retained on a metal affinity 
column. Furthermore, the complex was resistant to dissociation
with 1 M NaCl (4).

Sequencing of the genes encoding p7 and p4 identified 
ORFs 15 and 16, respectively (85). ORFs 15 and 16 display 
extensive sequence similarity to separate, nonoverlapp¬ 
ing regions of T7 RNAP and other members of the single¬ 
subunit DNA-directed RNA polymerase family. RNAPII 
contains nearly perfect matches to four sequence motifs 
(DxxGR, A, B, and C) that are important for polymerase 
activity (figure 21-6) (85). The crystal structure of T7 
RNAP resembles a cupped hand with thumb and finger 
subdomains that rise above the palm (38). The p7 subunit 
contains sequences corresponding to the N-terminal 
domain, the thumb, and a fragment of the palm in which the 
DxxGR motif is located, while p4 contains the remainder of 
the palm and the fingers subdomains, and all catalytically 
important residues within motifs A, B, and C (figure 21-6). 
This suggests that a gene fusion or gene splitting event 
occurred during the evolution of this class of polymerases. 


No sequence similarity to the N-terminal 264 aa of T7
RNAP was detected in RNAPII; in fact, the RNAPII
N-terminus appears to be truncated by 156 aa relative to
T7 RNAP (figure 21-6) (85). The T7 RNAP N-terminal
324 aa are involved in RNA binding, promoter recognition,
unwinding, and processive transcription (7, 30, 56, 75).

A consensus promoter sequence was derived by 
comparison of sequences upstream of six in vivo RNAPII 
transcription initiation sites (2). The recent availability of 
the complete N4 genome sequence led to the reexamination 
and identification of all middle transcription initiation sites 
(M. Hammer, unpublished). Through comparison of the 
activity of middle transcription initiation sites and genetic 
analysis of several middle promoters, a minimal N4 middle 
promoter consisting of an eleven nucleotide sequence 
directly upstream and inclusive of the initiating nucleotide 
(5'-tTttttTGA/GxG/T-3', with +1 the 3'-terminal position) was identified. These
data indicate a promoter comparable in size to that of 
yeast mtRNAP (3). 
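
To make the eleven-nucleotide consensus concrete, a scan for candidate sites might look like the sketch below. Several points are assumptions rather than statements from the source: lower-case letters and "x" in the consensus are treated as unconstrained positions, upper-case letters as required, and the +1 nucleotide as the last base of the motif. The function name and the test sequence are invented.

```python
import re

# Consensus 5'-tTttttTGA/GxG/T-3' rendered as a pattern.
# Assumption: lower-case t and x accept any base; upper-case bases are required.
# The lookahead lets overlapping candidates be reported.
MIDDLE_PROMOTER = re.compile(r"(?=([ACGT]T[ACGT]{4}TG[AG][ACGT][GT]))")


def find_middle_promoter_candidates(seq):
    """Return (motif_start, plus_one_index, motif) for each candidate site.

    Indices are 0-based; plus_one_index is the last base of the 11-mer,
    taken here to be the initiating nucleotide.
    """
    seq = seq.upper()
    return [
        (m.start(1), m.start(1) + 10, m.group(1))
        for m in MIDDLE_PROMOTER.finditer(seq)
    ]


# Invented test sequence with one embedded match.
print(find_middle_promoter_candidates("GGATTTTTTGACGCC"))
# [(2, 12, 'ATTTTTTGACG')]
```

A pattern this short will match by chance in any large genome, so hits are candidates to be checked against mapped initiation sites, not predictions.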

P17 is essential in vivo for N4 middle RNA synthesis (89).
In wild-type phage-infected cells, RNAPII and p17 are tightly
associated with an inner membrane/N4 DNA complex (23).
Upon infection with phage containing an amber mutation
in the gene encoding p17, RNAPII is found in the soluble
fraction (89). P17 can be released from the DNA/membrane
complex of N4-infected cells with 0.5 M NaCl. These
results suggested that p17 might be a DNA binding protein
that localizes RNAPII to DNA and confers promoter
specificity. Yeast mitochondrial RNAP, which also belongs
to the single-subunit family of DNA-dependent RNA polymerases,
does not recognize its cognate promoters (40, 53);
the specificity factor Mtf1p is required for promoter recognition
(37, 76, 86). ORF2 encodes p17 (henceforth referred to
as gp2), a 128 aa polypeptide that possesses no sequence
similarity to proteins in the database (4). Gp2 was purified
to homogeneity and assayed for binding to middle-promoter-containing
DNA fragments. Gp2 does not form complexes
with promoter-containing double-stranded DNA; however,
it does bind to single-stranded DNA nonspecifically with
high affinity (Kd = 20-60 nM) (4). Indeed, gp2 displays all
properties characteristic of a single-stranded DNA binding
protein (4).
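
To give a feel for what a Kd in the 20-60 nM range means, the sketch below computes the fraction of DNA sites occupied under the simplest possible model. The model is an assumption, not something the source states: one independent binding site, no cooperativity, and protein in excess so that free protein is approximately total protein. The function name and the 100 nM example concentration are illustrative.

```python
def fraction_bound(protein_nM, kd_nM):
    """Fraction of sites occupied for simple 1:1 binding.

    Uses theta = [P] / ([P] + Kd), which holds only when free protein
    is approximately equal to total protein and binding is noncooperative.
    """
    if protein_nM < 0 or kd_nM <= 0:
        raise ValueError("concentration must be >= 0 and Kd must be > 0")
    return protein_nM / (protein_nM + kd_nM)


# Occupancy at a hypothetical 100 nM protein across the reported Kd range.
for kd in (20, 60):
    print(f"Kd = {kd} nM -> fraction bound = {fraction_bound(100, kd):.2f}")
```

By definition, half of the sites are occupied when the protein concentration equals Kd. Single-stranded DNA binding proteins often bind cooperatively, in which case this hyperbolic model would understate how sharply occupancy rises.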

In vitro, purified gp2 does not enable RNAPII to transcribe 
double-stranded, promoter-containing templates. However, 
it does activate nonspecific transcription by RNAPII 
on single-stranded templates (4). Gp2 activation is pro¬ 
nounced at low RNAPII concentrations and is absent at 
high RNAPII concentrations, suggesting that this protein 
might activate transcription through recruitment of polymerase
to single-stranded templates (4). Gp2 and RNAPII
bind to single-stranded DNA synergistically, suggesting
that the proteins interact (4). Moreover, gp2 incubated with
hexahistidine-tagged RNAPII before metal-affinity chromatography
is retained with the polymerase on cobalt-agarose
resin, further supporting an interaction between the
proteins. After washing with 1 M NaCl, p4, p7, and gp2 are
eluted in equimolar amounts (4). Other SSBs tested are not
retained on the metal affinity column, indicating that gp2
and RNAPII interact specifically.

In vivo, expression of gp2 and N4 RNAPII is sufficient
for recognition of N4 middle promoters present on
plasmids (M. Hammer, unpublished data). The mechanism
of promoter recognition and the role of gp2 in this process
are under investigation. Moreover, whether gp2 simply
recruits RNAPII to the promoter or plays an additional role 
in open complex formation remains to be determined. 

Late Transcription 

N4 late RNA synthesis, unlike early and middle transcription,
is abolished by rifampicin (Rif) treatment in Rif-sensitive
E. coli hosts but is unaffected in Rif-resistant
hosts, indicating that E. coli RNA polymerase synthesizes
late RNAs (92). No phage late messenger RNA synthesis
is detected at the restrictive temperature in host strains
carrying temperature-sensitive mutations in the α, β, β',
and σ70 subunits of E. coli RNA polymerase, further confirming
that the σ70 holoenzyme is responsible for phage late
transcription (92).

All sites of late transcript initiation have been identified
by S1 nuclease mapping (8) and/or primer extension
(M. Hammer, unpublished data). Late promoter regions
show weak homology to the E. coli σ70-RNAP consensus
-10 and -35 elements and possess no other blocks of
conserved sequence. E. coli σ70 holoenzyme is inactive on
linear DNA templates containing late promoters; however,
it is active on supercoiled templates, raising the possibility
that activation of late transcription might require a change
in the topology of the DNA template (but see below) (8).

N4 late transcription begins shortly after the onset of 
phage DNA replication although concomitant or previous 
DNA replication is not required (8, 92). Mutant phage defi¬ 
cient in the N4-coded DNA polymerase or DNS, which are 
essential for N4 DNA replication, support late transcription 
albeit at a reduced rate (8). However, no late transcription 
occurs in cells infected with mutant phage deficient in N4 
SSB, suggesting that this protein is required for both 
N4 replication and late transcription (8). Purified N4 SSB 
does not interact with double-stranded DNA, and binds to 
single-stranded DNA cooperatively (47). In vitro, N4 SSB
activates E. coli σ70-RNAP transcription at late promoters
present on linear templates (8).

ORF45 encodes N4 SSB, a 265 aa protein with no
sequence similarity to other single-stranded DNA binding
proteins (9). A systematic mutational analysis of N4 SSB
defined two functional domains separated by a linker
region: a large N-terminal DNA binding domain and a short
C-terminal domain implicated in protein-protein interactions
(A. Miller, M. Choi, D. Wood, unpublished data).
In vivo, mutations in N4 SSB that affect single-stranded
DNA binding (Y75A and Y128A) impair phage DNA replication
and recombination but do not affect late transcription
(55). Residues at the extreme C-terminus of N4
SSB are required for N4 DNA replication, recombination,
and late transcription in vivo. Two mutant proteins,
Δ264-265:S260A and K264,265A, are proficient in both replication
and recombination but deficient in transcriptional activation
both in vivo and in vitro, suggesting that residues
S260, K264, and K265 play a specific role in this activation.

N4 SSB was retained on a metal affinity column 
containing either immobilized hexahistidine-tagged E. coli 
holoenzyme or core (55). Similar results were obtained 
with RNAP purified directly from cells or reconstituted 
from overexpressed subunits, indicating that no additional 
cellular proteins are required for this interaction (A. Miller, 
unpublished data). N4 SSB mutant proteins proficient 
in transcription activation but impaired in single-stranded 
DNA binding were retained on the RNAP column, while 
mutant proteins deficient in late transcriptional activa¬ 
tion were not (55). Therefore, the observed interaction is 
specific and reflects contacts made between RNAP and N4 
SSB that are necessary for transcriptional activation. 

N4 phage cannot be propagated in an E. coli rpoC ts strain,
397C, even at the permissive temperature (92). Specifically,
E. coli 397C cannot support late transcription, as demonstrated
by primer extension analysis (55). The rpoCts397C
subunit has a 5 amino acid deletion at position 1354, followed
by 23 out-of-frame amino acids; consequently, the 52
normally occurring C-terminal amino acids of β' are not
present (11). Crosslinking experiments were performed to
define the region of RNAP that interacts with N4 SSB. N4
SSB crosslinked to the 109 aa C-terminal peptide of the β'
subunit (55). Therefore, N4 SSB activates E. coli σ70-RNAP
at N4 late promoters through a direct contact with the β'
subunit (55).

Although activators can interact with RNAP in the
absence of their cognate binding sites (52, 63), specific activation
normally requires DNA binding specificity. N4 late
promoters do not share any conserved sequences that
might serve as an activator-binding site. How then does N4
SSB achieve specificity for activation at late promoters?
Specificity may be the result of "indirect readout" of promoter
sequences, in which N4 SSB preferentially recognizes
a specific conformation that RNAP adopts only at N4 SSB-dependent
promoters. It could also be achieved kinetically,
with N4 SSB affecting a step in transcription initiation
that is limiting only at N4 SSB-dependent promoters.
Alternatively, the N4 SSB-RNAP complex may not
exclusively transcribe from phage late promoters; such
limited specificity might be acceptable for a viral activator
acting late in a lytic cycle (55). Activators function
by recruiting RNAP to the promoter (66) or by facilitating
post-recruitment steps (e.g., open complex formation,
promoter clearance) (3, 57, 64). Because N4 SSB activates
transcription without binding to DNA, it most likely acts at
post-recruitment steps.

N4 Replication 

Host and Phage Functions Required for N4 DNA Replication

E. coli temperature-sensitive mutants defective in a number
of replication functions were tested for their ability
to support N4 growth and to permit incorporation of
[3H]thymidine into N4 DNA at the nonpermissive temperature
(32, 69). Three host functions were found to be required:
ribonucleotide reductase (dnaF), DNA ligase (lig), and DNA
gyrase (gyrB). E. coli cells defective in the polymerase
activity of DNA polymerase I (polA1) were able to support
phage growth, but cells defective in Pol I's 5'-3' exonuclease
activity (polAex1) could not, though [3H]thymidine incorporation
into phage DNA was observed in this strain. Analysis
of replication products by alkaline sucrose gradient
centrifugation suggests that the host exonuclease activity is
required for processing of phage Okazaki fragments (32).
The functions of the E. coli dnaA, dnaB, dnaC, dnaE, and
dnaG genes are not required for N4 DNA synthesis,
suggesting that the phage must encode the corresponding
proteins required for replication initiation, priming, DNA
unwinding, and polymerization. Five N4 gene products are
required for in vivo N4 DNA replication: a DNA polymerase
(N4 DNAP, ORF39), a single-stranded DNA binding
protein (N4 SSB, ORF45), a 5'-3' exonuclease (N4 5'-3'
EXO, ORF37), a protein of unknown function (DNS, ORF43),
and vRNAP (ORF50) (32,69).

An in vitro DNA replication system that utilizes a crude
extract prepared from N4-infected E. coli polA1 cells was
developed (70). This system requires exogenously added
N4 DNA but shows little or no activity on other templates.
The activities of the N4 DNAP, N4 SSB, and N4 5'-3' EXO
are also required; therefore, these proteins were purified to
apparent homogeneity by in vitro complementation (70).

N4 DNAP is monomeric in solution and has an apparent
molecular weight of 87,000; this is lower than the molecular
weight calculated based upon its sequence (888 aa,
MW 101,262). N4 DNAP is homologous to members of the
Pol I DNAP family. It absolutely requires a ribo- or
deoxyriboprimer; small gapped DNAs serve as preferred templates.
The N4 DNAP polypeptide has a strong Mg2+-dependent 3'-5'
exonuclease activity that can be suppressed under
polymerization conditions (46). The enzyme exhibits very high
mismatch excision activity even at high nucleotide
concentrations, a condition that normally favors polymerization
over excision. Measurement of in vitro base substitution
fidelity indicates that N4 DNAP is 16-fold more active in this
assay than E. coli pol I Klenow fragment (46). As is true of
other replicative DNA polymerases, N4 DNAP lacks 5'-3'
exonuclease and strand displacement activities and is
nonprocessive, indicating the requirement for accessory
factors. N4 SSB increases N4 DNAP's processivity 300-fold.
It is unknown whether this stimulatory effect is the result
of a specific DNA conformation elicited by N4 SSB binding
or caused by SSB-DNAP interactions (47). Processivity
factors commonly associated with replicative polymerases
either form a structure that encircles the DNA or, in the
case of T7 DNAP and its accessory factor thioredoxin, bind
to DNAP and modify its structure so that it is able to encircle
the DNA (19, 41). It is difficult to predict whether a clamp
is encoded by the N4 genome; although clamps from different
organisms display remarkable conservation of structure,
they do not share sequence similarity (35).

As mentioned previously, N4 SSB is a 265 aa protein 
that binds to single-stranded DNA cooperatively and more 
tightly than to RNA. N4 SSB has a binding site size of 11
nucleotides, an intrinsic binding constant of 3.8 × 10^4 M^-1,
and a cooperativity value (ω) of 300 when it binds to
single-stranded DNA in 0.22 M NaCl at 37 °C (47). The in vivo
concentration of N4 SSB is calculated to be 9.8 µM (47).
In light of its binding site size and affinity, we predict that 
there is sufficient N4 SSB present in N4-infected cells to 
saturate all single-stranded DNA available during phage 
replication. 
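
The saturation argument can be checked with a rough calculation. The sketch below is illustrative only: it takes the intrinsic constant, cooperativity, and site size from the text, reads the in vivo concentration as 9.8 µM (an assumed reading of the reported value), and assumes a cell volume of 1 fL, which is not given in the source. For highly cooperative lattice binding, the midpoint of the binding transition lies near a free protein concentration of 1/(Kω), so the ratio of the cellular concentration to that midpoint gives a sense of how far into saturation the system sits.

```python
# Rough check of the N4 SSB saturation argument.
# Values marked ASSUMED are not taken from the chapter.

AVOGADRO = 6.022e23

K_INTRINSIC = 3.8e4       # intrinsic binding constant, M^-1 (from the text)
OMEGA = 300               # cooperativity parameter (from the text)
SITE_SIZE_NT = 11         # nucleotides covered per SSB monomer (from the text)
SSB_CONC_M = 9.8e-6       # in vivo concentration; ASSUMED reading of 9.8 micromolar
CELL_VOLUME_L = 1e-15     # ASSUMED cell volume of about 1 femtolitre


def effective_constant(k_intrinsic, omega):
    """Effective affinity for binding next to an already bound protein."""
    return k_intrinsic * omega


def midpoint_concentration(k_intrinsic, omega):
    """Approximate free-protein concentration at half saturation, 1/(K*omega)."""
    return 1.0 / effective_constant(k_intrinsic, omega)


def fold_above_midpoint(conc, k_intrinsic, omega):
    """How many times the cellular concentration exceeds the midpoint."""
    return conc * effective_constant(k_intrinsic, omega)


def coatable_nucleotides(conc, volume_l, site_size):
    """Nucleotides of ssDNA that the cellular SSB pool could cover."""
    molecules = conc * volume_l * AVOGADRO
    return molecules * site_size


if __name__ == "__main__":
    print("K*omega (M^-1):", effective_constant(K_INTRINSIC, OMEGA))
    print("midpoint (M):", midpoint_concentration(K_INTRINSIC, OMEGA))
    print("fold above midpoint:", fold_above_midpoint(SSB_CONC_M, K_INTRINSIC, OMEGA))
    print("coatable nt per cell:", coatable_nucleotides(SSB_CONC_M, CELL_VOLUME_L, SITE_SIZE_NT))
```

Under these assumptions the cellular concentration is roughly two orders of magnitude above the midpoint, which is consistent with the prediction that single-stranded DNA is saturated; the per-cell coverage figure depends entirely on the assumed volume.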

The N4 5'-3' exonuclease has a denatured molecular
weight of 45,000 and exists as a dimer in solution (31).
Its preferred substrate is duplex DNA containing 3' extensions
(i.e., N4 DNA), which it degrades to 5' mononucleotides
by a distributive mechanism. It is inactive at nicks or gaps
and on single-stranded DNA (31). These properties suggest a
role for the exonuclease in recombination or replication
of the ends of the N4 genome. The 439 aa exonuclease is
encoded by ORF37, which displays striking sequence similarity
to the T4-encoded DDA helicase. It is unknown
whether the ORF37 product has intrinsic helicase activity.
The phenotype of an exonuclease mutant suggests that 
the N4 enzyme serves a very different purpose in phage 
development than other phage-coded exonucleases. In vitro, 
no detectable exonuclease activity is found in extracts of 
nonsuppressor strains infected with N4 exoamDll phage 
(31). In vivo, phage DNA synthesis proceeds normally in 
exoamDll infected cells at 37 °C for 60 minutes, then stops. 
At 42°C, replication begins but ceases within a few minutes 
(D. Guinta, S. Spellman, unpublished data). The possible role 
of the exonuclease in N4 infection will be discussed below. 

At least two additional N4-coded functions are required
for N4 DNA replication in vivo: the ORF43 product
(DNS) and vRNAP (32). ORF43 encodes a 715 aa protein
with limited sequence similarity to an ORF located within
the S. aureus spi pathogenicity island that has a putative
role in replication. Analysis of phage encoding the vRNAP
mutation ts150 revealed a role for vRNAP in N4 DNA
replication. vRNAP purified from ts150 phage virions is
sensitive to thermal inactivation, while enzyme purified
from wild-type virions is temperature resistant (24). Shifting
a culture of N4 ts150-infected cells from the permissive to
the restrictive temperature 25 minutes post-infection results
in a rapid decrease in the rate of [3H]thymidine incorporation
into replicating phage DNA, thus suggesting that vRNAP
plays a role in phage DNA replication. Interestingly, the
ts150 mutation (H328Y) does not lie in the transcriptionally
active central domain of the vRNAP polypeptide
but instead is located in the amino-terminal domain
(K.M.K., unpublished).

Mechanism of N4 DNA Replication 

Replication in the in vitro system initiates at each end of 
the N4 genome and proceeds inward (70). Two-dimensional 
gel electrophoresis of restriction fragments from in vitro 
replicating N4 DNA molecules suggests that initiation 
occurs through hairpin priming of the genome’s single- 
stranded ends (70). The in vitro replication system's dependence
on the N4 5'-3' exonuclease and the pattern of N4
DNA replication observed in N4exoamDll infected cells at
42 °C might reflect a requirement for nucleotide removal
from the genome's 5' recessed ends so that hairpin formation
and priming can occur. Electron microscopy of in vivo
replicating N4 DNA revealed Y-shaped molecules and
molecules with single-stranded tails. Strikingly, no replication
bubbles were observed. These results strongly suggest
that replication origins lie near or at the ends of the
genome (48). However, the use of RNA primers for replication
initiation has not been ruled out; indeed, N4 vRNAP's
role in replication has not been elucidated. It is possible
that vRNAP primes replication from promoter Pel, located
in the genome's terminal repeats (32). The accumulation of
Okazaki-like fragments after infection of polAex1 or lig mutants
suggests discontinuous DNA synthesis (32), though
an N4-coded primase activity has not been identified.

Longer than unit-length molecules were observed when 
replicating DNA was analyzed by electron microscopy (48). 
Restriction enzyme analysis identified a restriction fragment
containing one copy of the terminal repeat flanked
by the right and left ends of the genome (i.e., the joint fragment)
(48). Furthermore, results of in vivo pulse-chase
experiments show that label accumulates in the joint fragment
and is chased into both ends of mature DNA molecules,
strongly suggesting that concatemers of genomic DNA
exist during N4 replication. It is likely that these originate 
through homologous recombination at the terminal repeats. 

The 3' extensions of the N4 genome are unusual; since
all known DNA polymerases synthesize in a 5' to 3' direction,
N4 must have a mechanism for generating these
sequences. A model in which simple site-specific cleavage
of concatemeric DNA during packaging yields the correct
3' extensions at each end in one step is attractive, but by
necessity implies that 50% of all synthesized DNA molecules
would not be packaged into virions. Such a model, however,
could account for the unit-length DNA lacking direct
repeats observed in N4-infected cells (48). Moreover, it could
explain the phenotype of exoamDll infection at 37 °C. The
N4 exonuclease may degrade alternate copies of genomic
DNA in a concatemer in order to provide deoxynucleotides
for continued DNA synthesis. Unlike other phage, N4 does
not degrade the host chromosome during infection (32, 73)
and consequently requires the activity of E. coli
ribonucleotide reductase to provide a nucleotide pool (32).
Perhaps a supplementary source of nucleotides is required
late in phage infection, and in exoamDll phage infection
DNA synthesis stops as a result of nucleotide depletion.

N4 Genetics 

Analysis of N4 development has been hindered by the lack 
of a genetic map; very high recombination levels have 
prevented the generation of a linkage map. HpaI fragments
of the N4 genome have been cloned into pBR322 (50). Two
regions of the genome originating within HpaI fragments
M (ORFs 6, 7, and 8) and D (N4 SSB, ORFs 46 and 47)
have been cloned only under tightly repressed conditions,
indicating that they encode N4 functions lethal to the
host. HpaI M is transcribed early in infection and could
encode the protein responsible for inhibiting host DNA
replication. HpaI fragment D encodes N4 SSB, which is
lethal to the host even at low expression levels.

Heat- and citrate-induced deletions defined nonessential
regions of the N4 genome (50). Deletions were obtained
with high frequency in HpaI fragment A. The largest deletion
spans a 6 kb region that includes ORFs 33 and 34, which
encode proteins displaying sequence similarity to T4 RIIA
and T4 RIIB (50; E. Stojkovic, unpublished data).

Morphogenesis 

N4 does not encapsidate its DNA by full-head packaging
but through the recognition of specific sequences at both
ends of the genome, as indicated by our ability to isolate
heat- and citrate-resistant N4 phage (60). As expected,
phage genomes containing deletions have wild-type terminal
sequences (50). The genes for known virion proteins
have been identified, though the pathway of N4 virion
assembly and DNA encapsidation is undefined. Recently,
we addressed the possible role of the multifunctional
vRNAP polypeptide in phage morphogenesis. The N- and
C-terminal domains of vRNAP could potentially be important
for encapsidation of vRNAP into virions, processing of
DNA from concatemers and genome packaging into virions,
or injection of vRNAP and the genome into the host cell.
Infection of nonsuppressing E. coli strains with phage
containing amber mutations in vRNAP yields progeny
phage apparently containing all virion proteins except
vRNAP (I. Kaganman, unpublished data). These virions are
morphologically indistinguishable from wild-type virions by
electron microscopy, and contain one copy of N4 genomic
DNA possessing wild-type ends, indicating that vRNAP is
not involved in concatemer processing or DNA packaging.

N- and C-terminally truncated forms of the vRNAP polypeptide
were expressed from plasmids during infection with
vRNAP amber mutant phage; only polypeptides that
contain the full vRNAP C-terminal domain were packaged
into virions, indicating that this domain is necessary and
sufficient for vRNAP encapsidation. Results of subsequent
infection experiments indicate that the vRNAP's N-terminal
domain is required for injection of the genome into the
host cell (I. Kaganman, E. Davydova, A. Demidenko,
unpublished data).


Lysis 

N4 infection displays a lysis-inhibited phenotype (62, 80). 
However, treatment of N4-infected cells with chloroform 
induces lysis when added as early as 20 minutes after infection,
suggesting that N4 encodes a lysin (E. Stojkovic,
unpublished data). A late gene product must cause lysis 
because cells infected with N4 SSB mutant phage do not 
lyse upon chloroform addition. 

Sequence analysis of N4 late genes identified ORF61 as 
a possible lysis gene due to its limited sequence similarity to 
a spore-cortex lytic enzyme from Clostridium perfringens. 
The 208 aa ORF61 protein does not display sequence similarity
to other phage lysins but is homologous to hypothetical
proteins encoded by the genomes of Novosphingobium
aromaticivorans, Salmonella typhimurium, Salmonella typhi, 
Neisseria meningitidis, Brucella melitensis, Mesorhizobium 
loti, and Rhodobacter capsulatus. 

ORF61 was cloned under the control of an inducible 
promoter. Cell lysis was detected 20 minutes after induction 
of gp61 synthesis, indicating that the ORF61 gene product 
does possess lysis activity and that this activity is able to 
access the cell peptidoglycan layer (E. Stojkovic,
unpublished data). Gp61 is predicted to possess a positively
charged N-terminus followed by a transmembrane domain. 
These features suggest that the protein is membrane 
bound with “N-in, C-out” topology. Proteins containing 
PhoA-fusions to the gp61 transmembrane domain or to 
its C-terminus displayed alkaline phosphatase activity, 
indicating that the PhoA domain was localized to the 
periplasm (E. Stojkovic, unpublished data). C-terminally 
hexahistidine-tagged recombinant gp61 purified to homogeneity
degrades purified peptidoglycan, confirming that
gp61 is a lysin (E. Stojkovic, unpublished data). 


Analysis of muropeptides generated from gp61 digestion
of E. coli peptidoglycan indicates that gp61 is an
N-acetylmuramidase; its homology to a number of hypothetical
proteins defines a new family of lysins. The existence of an
N4-encoded lysin raises the question of why N4 infection
is lysis-inhibited (80). Inspection of the surrounding ORFs
revealed that ORF61 is preceded by a small ORF (ORF62)
containing two potential transmembrane domains, and
followed by two overlapping ORFs (ORF60 and ORF60'),
one of which (ORF60') is out of frame. The organization
of ORFs 60 and 60' is reminiscent of the λ genes encoding
Rz and Rz1. ORFs 63-60 are transcribed from a promoter
located immediately upstream of a transcription terminator
(E. Stojkovic, unpublished data). The roles of these ORF
products in regulation of the gp61 lysin activity during
N4 infection remain to be determined.

Prospects 

We have presented our current understanding of bacteriophage
N4. Sequencing of the genome has provided more
questions than answers. What little we have learned already 
shows that N4 is unique among lytic phages. Important 
questions to be answered in the near future are: 

1. What are the mechanisms of vRNAP and genome 
injection into the host? What is the exact role of the 
N-terminal domain of the vRNAP polypeptide in the 
process? What are the roles of NfrA and NfrB? Are 
additional host functions, i.e., chaperones, required 
for injection? 

2. What is the structure of the central active domain of 
vRNAP (mini-vRNAP)? How does it interact with the 
promoter hairpin, and what conformational change 
occurs upon promoter binding to yield a salt-resistant
complex? What is the structure of the N4 early region 
in vivo that allows extrusion of the promoter hairpins? 
What is the in vivo structure of the activated early 
promoter and where is Eco SSB bound? 

3. How is N4 RNAPII endowed with promoter specificity?
What is the role of gp2 in this process? Do
RNAPII accessory factors play other roles in addition 
to promoter recognition? 

4. What is the mechanism by which N4 SSB activates
E. coli σ70-RNA polymerase at late promoters?

5. How is N4 replication initiated in vivo? What proteins 
are required for origin activation? What additional N4 
functions are required at the replication fork? What is 
the role of vRNAP in N4 DNA replication? 

6. How are the 3' single-stranded ends of mature N4 
DNA generated? 

7. Where is vRNAP localized in virions? With what 
protein(s) does vRNAP interact in virions? 

8. How is cell lysis regulated in N4-infected cells? 
Phage φ29 and its Relatives

Phage φ29 and the related phages PZA, PZE, φ15, BS32,
B103, Nf, M2Y, and GA-1 are lytic phages that belong to
the Podoviridae family. Most of them infect Bacillus subtilis
and other related species such as Bacillus amyloliquefaciens,
Bacillus pumilus, and Bacillus licheniformis. The host range
of phage GA-1 differs from those of the other phages:
GA-1 infects the Bacillus strain G1R. The φ29-related phages
have been classified in three groups based on serological
properties, peptide maps, DNA physical maps, and DNA
sequences (11, 90, 153). The first group includes phages φ29,
PZA, PZE, φ15, and BS32; the second includes phages B103,
Nf, and M2Y; and the third group contains phage GA-1 as the
only member.

The genome of the φ29-related phages, which are the
smallest of the Bacillus phages isolated so far, consists of a
linear double-stranded DNA of 19-21 kb with a phage-encoded
protein, named terminal protein (TP), covalently
linked to each 5' end. The complete DNA sequences of
phages φ29 (19,285 bp), PZA (19,366 bp), B103 (18,630 bp),
and GA-1 (21,129 bp) are known (90, 109, 112, 149). Phage
φ29 and its relatives have a short inverted terminal repeat,
six nucleotides long (5'-AAAGTA) for φ29, PZA, φ15, and
B103 DNAs, eight nucleotides long (5'-AAAGTAAG) for Nf
and M2Y DNAs, and seven nucleotides long (5'-AAATAGA)
for GA-1 DNA. The remainder of the DNA sequence of phages
φ29 and PZA is very similar, and different from that of phage
B103. The sequence of GA-1 DNA is less related to that of the
other phages.
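
What "inverted terminal repeat" means at the sequence level can be illustrated with a short sketch. The function below is a hypothetical helper, not a published tool: it reads one strand of a linear genome 5' to 3' and counts how many nucleotides at its 5' end match the 5' end of the complementary strand (that is, the reverse complement of the 3' end). The toy genome is invented for illustration; only its six terminal nucleotides follow the 5'-AAAGTA repeat quoted above.

```python
# Minimal sketch: measure the inverted terminal repeat of a linear genome.
# The helper names and the toy sequence are assumptions for illustration.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def inverted_terminal_repeat_length(genome, max_len=20):
    """Count matching nucleotides between the two 5' ends of a linear duplex.

    The 5' end of the top strand is compared, base by base, with the
    5' end of the bottom strand (the reverse complement of the top
    strand's 3' end). Counting stops at the first mismatch.
    """
    genome = genome.upper()
    limit = min(max_len, len(genome) // 2)
    if limit == 0:
        return 0
    left_end = genome[:limit]
    right_end = reverse_complement(genome[-limit:])
    length = 0
    for a, b in zip(left_end, right_end):
        if a != b:
            break
        length += 1
    return length


if __name__ == "__main__":
    # Toy genome: AAAGTA at the left 5' end, TACTTT at the right 3' end.
    toy = "AAAGTA" + "CCGGATTCGA" + "TACTTT"
    print(inverted_terminal_repeat_length(toy))
```

Because the terminal protein is linked to each 5' end, a repeat of this kind means that both strands begin with the same short sequence.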

The phage φ29 particle consists of a prolate head (54 nm
long and 42 nm wide), and a neck/tail region (44 nm long)
(141). The head, formed by protein p8, contains fibers
formed by protein p8.5. The neck consists of an upper collar
or connector (p10), required for head assembly, and a lower
collar (p11) to which 12 spindle-shaped appendages are
attached (p12*). These appendages are required for adsorption
of the phage to the bacterial cell wall. The tail is formed
by protein p9.

In this chapter I describe the molecular mechanisms
involved in DNA replication, regulation of transcription,
and phage morphogenesis, most of which have been elucidated
by studying phage φ29.


Genetic and Transcriptional Organization of φ29 and its Relatives

As shown in figure 22-1, the genetic and transcriptional
maps of phages φ29, B103, and GA-1, representative of
groups I, II, and III, respectively, are similarly organized.
Early genes are located at the two DNA ends and they are
transcribed from right to left, whereas late genes, located at
the center of the genome, are transcribed from left to right.
In the three phages, early genes 2, 3, 5, 6, 16.7, and 17 encode
similar proteins required for DNA replication, and gene 4
encodes a protein involved in transcription regulation.
Protein p6 also has a role in transcription regulation, in
addition to its requirement for DNA replication. Gene 1, also
involved in DNA replication, is present only in phage φ29.
There are other open reading frames (ORFs) in the three
phages whose function is still unknown. It should be noted
that, in accordance with its larger size, GA-1 DNA has more
ORFs in the region corresponding to early genes than φ29
and B103 DNAs. The three phage DNAs contain a region at
their left end encoding a small RNA, named pRNA, required
for packaging of the phage DNA. The late genes of the three
phages are very similar, except that phage GA-1 lacks gene
8.5, coding for the head fibers, which in φ29 is a dispensable
protein. The functions of the different genes of phage φ29,
their size, and the percentage similarity with the corresponding
genes of phages B103 and GA-1 are given in
table 22-1. In all cases, late transcription occurs from a
single promoter, named A3. The major early promoters,
A2b, A2c, and C2, are also conserved in the three phage
DNAs. The A1 promoter (A1b in phage GA-1), responsible for
the synthesis of pRNA, is also conserved in the three phages.
The remaining, minor promoters are specific to each phage
(see figure 22-1).
Table 22-1 Genes of phage φ29 and their function

                                                                           Similarity (%) a
Gene   Function                            No. of amino   Mol. mass   φ29/B103   φ29/GA-1   B103/GA-1
                                           acids          (kDa)

1      DNA replication                      89            10.3        -          -          -
2      DNA polymerase                      572            65.2        88.5       67.3       68.0
3      Terminal protein                    266            31.1        74.1       51.7       53.6
4      Transcriptional regulator           125            15.1        83.2       57.6       58.4
5      ssDNA binding protein               124            13.4        73.8       37.9       45.1
6      dsDNA binding protein               104            12.0        65.0       52.7       54.8
7      Scaffolding protein                  98            11.2        75.5       48.0       38.6
8      Major head protein                  448            49.7        90.4       68.5       69.3
8.5    Head fiber protein                  280            29.6        63.6       -          -
9      Tail protein                        599            67.7        75.3       58.3       58.5
10     Connector (upper collar protein)    309            35.9        84.4       63.7       63.7
11     Lower collar protein                293            33.6        77.8       51.6       50.9
12     Preneck appendage protein           854            92.4        79.9       37.8       37.5
13     Morphogenesis (tail assembly)       365            41.0        82.7       67.0       67.3
14     Holin                               131            14.9        87.1       51.9       52.3
15     Peptidoglycan hydrolase             258            26.9        74.8       37.2       40.6
16     ATPase, DNA encapsidation           332            38.9        86.3       67.5       68.8
16.7   DNA replication, membrane protein   130            15.2        68.5       48.5       47.7
17     DNA replication                     167            19.4        48.1       44.8       37.9

Data taken from Salas (121), and Meijer et al. (90).
ds, double-stranded; ss, single-stranded.

a For the calculation of the percent similarity, the following amino acids were considered to be conservative: L, I, V, A, and M; F, Y, and W; K and R; D and E; Q and
N; S and T.
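
The footnote's definition of conservative substitutions can be turned into a small calculation. The sketch below is an illustration under stated assumptions, not the procedure used to produce the table: it takes two already aligned sequences of equal length, counts a column as similar when the residues are identical or fall in the same conservative group, treats any column containing a gap as dissimilar, and divides by the full alignment length. The alignment method and the choice of denominator are not specified in the source.

```python
# Percent similarity using the conservative groups listed in the table footnote.
# Gap handling and the denominator are ASSUMPTIONS; the source does not state them.

CONSERVATIVE_GROUPS = ["LIVAM", "FYW", "KR", "DE", "QN", "ST"]

# Map each residue to the index of its conservative group.
GROUP_OF = {
    residue: index
    for index, group in enumerate(CONSERVATIVE_GROUPS)
    for residue in group
}


def is_similar(a, b):
    """True if two residues are identical or in the same conservative group."""
    if a == b:
        return True
    group_a = GROUP_OF.get(a)
    return group_a is not None and group_a == GROUP_OF.get(b)


def percent_similarity(aligned_a, aligned_b, gap="-"):
    """Percent of alignment columns that are identical or conservative."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    similar = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue  # gapped columns count as dissimilar
        if is_similar(a, b):
            similar += 1
    return 100.0 * similar / len(aligned_a)


if __name__ == "__main__":
    print(percent_similarity("LKDS", "IRET"))
    print(percent_similarity("LKDS", "LKDF"))
```

A different treatment of gaps, or normalizing by the shorter protein, would shift the percentages, so values computed this way should not be expected to reproduce the table exactly.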
Figure 22-1 Genetic and transcriptional maps of φ29, B103, and GA-1 DNA. The maps are aligned with respect to the A2c,
A2b, and A3 promoters. The direction of transcription and length of the transcripts are indicated by arrows. The positions of 
genes 16.7 and 17 are indicated. ORFs 16.9, 16.8, 16.6, and 16.5 are indicated by the numbers .9, .8, .6, and .5, respectively. 
Transcriptional terminators are indicated by stem-loop structures. The grey box indicates the DNA region encoding the 
pRNA, and the black box indicates the region containing the early promoters A2c and A2b, and the late promoter A3. 

The figure is adapted from Meijer et al. (90). 
All early promoters of the three phages have the consensus
sequences at the -10 (TATAAT) and -35 (TTGACA)
regions separated by a spacer of 17-18 nucleotides, except
the φ29 minor promoter A1IV that lacks the -10 hexamer.
In the three phages, the late A3 promoter has a -10 hexamer
(TATAAT) but lacks the -35 one. Most of the promoters of
the three phages have the so-called -10 extended region or
TG motif (TGN-10). The involvement of the TG motif in
promoter strength has been studied for the φ29 promoters
A1, A2c, and A3. Mutation of the TG motif impaired the
binding of σA-RNA polymerase (RNAP) to each promoter
(30), suggesting that the TG motif provides additional contact
sites for the RNAP.
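
The promoter architecture described here can be expressed as a simple pattern search. The following sketch is illustrative only: it looks for exact matches to the two consensus hexamers separated by 17 or 18 nucleotides and reports whether a TG dinucleotide sits one position upstream of the -10 hexamer (the TGN motif). Real promoters usually deviate from the consensus, so an exact-match scan is a simplification; the function name and the toy sequence are assumptions.

```python
import re

# Consensus elements quoted in the text.
MINUS_35 = "TTGACA"
MINUS_10 = "TATAAT"

# Lookahead so that overlapping candidate promoters are all reported.
PROMOTER = re.compile(rf"(?=({MINUS_35})([ACGT]{{17,18}})({MINUS_10}))")


def find_promoters(seq):
    """Return exact consensus promoters found on the given strand.

    Each hit records the 0-based start of the -35 hexamer, the spacer
    length, and whether the spacer ends in TGN (extended -10 motif).
    """
    hits = []
    for match in PROMOTER.finditer(seq.upper()):
        spacer = match.group(2)
        hits.append(
            {
                "start": match.start(1),
                "spacer_len": len(spacer),
                "tg_motif": spacer[-3:-1] == "TG",
            }
        )
    return hits


if __name__ == "__main__":
    # Toy early-type promoter: -35 hexamer, 17-nt spacer ending in TGC, -10 hexamer.
    early_like = "GG" + MINUS_35 + "A" * 14 + "TGC" + MINUS_10 + "CC"
    # Toy late-type promoter: -10 hexamer only, as described for A3.
    late_like = "A" * 30 + MINUS_10
    print(find_promoters(early_like))
    print(find_promoters(late_like))
```

A sequence with only a -10 hexamer, like the late A3 promoter, returns no hits from this scan, which mirrors the point that such a promoter needs help from an activator.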

Transcriptional Regulation 

Early Promoter C2: Regulation by Protein p6 

The activity of the early promoter C2 of phage φ29 decreases
very soon after infection (102). Protein p6, an early
double-stranded DNA binding protein expressed in high amounts
and required for the initiation of φ29 DNA replication, was
shown to be responsible for in vivo and in vitro repression of
promoter C2 (3, 152). Protein p6 binds to the φ29 DNA ends
forming a nucleoprotein complex (see later). Nonetheless,
the complex formed at the right DNA end does not occlude 
promoter C2 to the RNAP. Protein p6 binding alters the 
structure of the transcriptional complex and affects the 
stability of the closed complex (31). 

The GA-1 C2 promoter is also expressed only early after 
infection. The GA-1 protein p6 inhibits in vitro transcription 
from this promoter (77). 

Early Promoters A2b and A2c and Late Promoter A3: Regulation by Proteins p4 and p6

As already mentioned, the late A3 promoter lacks the -35
hexamer, so that the RNAP cannot bind to it. Instead, the
sequence 5'-CTTTTT-15bp-AAAATG-3' forms a recognition
site for protein p4 at position -82 from the transcription
start site. This region has an intrinsic curvature of 45° that
increases up to 80-85° on p4 binding (117). A kinetic analysis
of the activation process, as well as band-shift and DNase
I footprinting assays, showed that the main role of protein p4
is to stabilize the binding of RNAP to the promoter as a
closed complex (108) through a specific interaction between
protein p4 and RNAP. Thus, protein p4 mutants affecting
residue Arg120 bind to DNA efficiently but they have a
reduced ability to interact with RNAP and to activate
transcription (96). The protein p4 C-end is also known to
participate in the maintenance of DNA bending. Reduction
of bending was significant when two basic residues were
simultaneously mutated in any combination. Therefore, the
C-end of p4 participates both in DNA bending and in
transcription activation. Considering that protein p4 is a dimer
in solution and binds to DNA as a tetramer, a model was
proposed in which two p4 subunits would be in close contact
with the DNA, interacting with the DNA backbone and
bending it. The other two subunits of p4 would be free to
interact with RNAP through the surface centered around
Arg120. On the other hand, protein p4 stabilizes the purified
α-subunit of B. subtilis RNAP at the A3 promoter through an
interaction that is dependent on p4 residue Arg120. To localize
the contact site of protein p4 on the RNAP, deletions at the
C-end of the α-subunit, lacking the last 15, 37, or 59 residues,
were obtained. RNAPs reconstituted with any of the deletions
could not recognize the late A3 promoter in the
presence of p4. DNase I footprinting assays showed that
protein p4 was unable to stabilize the mutant RNAPs at the
A3 promoter (95). Therefore, the interaction between p4 and
RNAP, which stabilizes the latter at promoter A3 to activate
transcription, is maintained between the protein p4 region
containing residue Arg120 and the C-terminal domain
(CTD) of the RNAP α-subunit.
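
The p4 recognition sequence quoted above, 5'-CTTTTT-15bp-AAAATG-3', can be located in a sequence with a short pattern search. The sketch below is a hypothetical helper for illustration: it requires exact matches to both six-nucleotide arms with a 15-nucleotide spacer and reports the 0-based start and midpoint of each 27-bp site. It searches only the strand it is given, and the toy sequence is invented.

```python
import re

# Exact form of the p4 recognition site given in the text; lookahead
# allows overlapping candidate sites to be reported.
P4_SITE = re.compile(r"(?=(CTTTTT[ACGT]{15}AAAATG))")
SITE_LENGTH = 6 + 15 + 6  # two arms plus the spacer


def find_p4_sites(seq):
    """Return (start, center) index pairs for each exact p4 site."""
    sites = []
    for match in P4_SITE.finditer(seq.upper()):
        start = match.start(1)
        center = start + SITE_LENGTH // 2
        sites.append((start, center))
    return sites


if __name__ == "__main__":
    toy = "GGG" + "CTTTTT" + "ACGTACGTACGTACG" + "AAAATG" + "CC"
    print(find_p4_sites(toy))
```

Given a site midpoint, the expected transcription start for an A3-like arrangement would lie 82 bp downstream, following the -82 position stated in the text.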

The protein p4 binding site at the late A3 promoter overlaps
with the -35 region of the early promoter A2b. In vitro
studies have shown that p4 represses the A2b promoter by
excluding the RNAP from the latter, directing it to the late
promoter. In addition, the curvature induced by p4 binding
impairs transcription from promoter A2b (116). Activation of
the A3 promoter of phage Nf, which belongs to the same
group as phage B103, responds to the φ29 protein p4 in a
similar way to that described for the φ29 promoter A3
(107). In the case of phage GA-1, the corresponding p4
protein, named p4G, binds to a site upstream of the late A3
promoter that overlaps with the early A2b promoter,
preventing the binding of the RNAP to the latter and repressing
transcription. However, binding of p4G to its site
upstream of the late A3 promoter did not activate transcription.
Promoter A3 was expressed efficiently in vitro in both
the absence and the presence of p4G. Nonetheless, promoter
A3 of GA-1 was not active in vivo when protein synthesis
was inhibited. These results lead to the suggestion that the
GA-1 A3 promoter may be repressed in vivo by a host protein
and that protein p4G may act as an antirepressor, allowing
expression of promoter A3 at late infection times (78).

Protein p4 also represses the early promoter A2c by bind¬ 
ing to DNA immediately upstream from RNAP in a way that 
does not hinder polymerase binding. On the contrary, the 
two proteins bind cooperatively to DNA. Upstream from the 
A2c promoter there are two protein p4 binding sites, named 
site 1 and site 2. Protein p4 by itself binds to site 1 with low 
efficiency. In the presence of RNAP, protein p4 is displaced 
from site 1 to site 2, increasing the efficiency of binding 
(100). In the presence of p4, RNAP can form an initiated 
complex at the A2c promoter that generates short abortive 
transcripts, but cannot leave the promoter (103), leading to 
transcription repression by a mechanism different from the
one that occurs at the adjacent A2b promoter. Thus, repression
of the A2c promoter occurs at the step of promoter
clearance, while that of promoter A2b takes place by exclusion
of the RNAP from the promoter due to the binding of
protein p4 to its recognition site upstream from the A3
promoter (site 3), which overlaps with the -35 hexamer
of promoter A2b.

The fact that protein p4 and RNAP bind cooperatively to
the promoter suggested that the repression mechanism
might involve interaction between the two proteins. Indeed,
mutation of protein p4 residue Arg120, which prevents the
contact between the two proteins, leads to a loss of repression.
As described above, p4 residue Arg120 is also critical
for the activation of the late A3 promoter. In this case,
Arg120 participates in the stabilization of RNAP at the
promoter. At promoter A2c, the result of such interaction is
that protein p4 holds the RNAP at the promoter as an
initiated complex. Therefore, this region of protein p4
behaves as an activation surface at promoter A3 and as a
repression surface at promoter A2c.

Regarding the interaction with the RNAP, protein p4
could form a complex at the A2c promoter with the wild-type
α-subunit but not with a deletion mutant lacking the
15 C-terminal amino acids. In addition, promoter repression
was impaired when a reconstituted RNAP lacking the 15
C-terminal amino acids of the α-subunit was used. Protein p4
could not interact with this mutant RNAP at promoter A2c
(104). Thus, the C-terminal domain of the α-subunit can
receive regulatory signals both from transcriptional activators
and repressors. On the other hand, the contact between
protein p4 and RNAP through the p4 domain containing
Arg120 can activate or repress transcription depending on
the promoter. Although the interaction with the RNAP
occurs through the same surface of protein p4 at the A3
and A2c promoters, the proteins are located at different relative
distances in each case: the p4 binding site is centered at
position −82 in promoter A3, and at position −71 in promoter
A2c. In addition, the A2c promoter contains a good
consensus sequence for σA-RNAP at the −35 hexamer,
whereas promoter A3 does not contain the −35 box. Analysis
of mutant promoters in which either the distance
between the p4 binding site and the transcription start site,
or the presence or absence of a −35 box was changed, indicated
that the position of protein p4 relative to that of RNAP
does not dictate the outcome of the interaction. Rather, it is
the absence or presence of a −35 hexamer that determines
whether activation or repression occurs. Thus, overstabilization
of the RNAP by the presence of a consensus −35
hexamer leads to repression by protein p4, whereas the lack
of such a sequence gives rise to p4 activation (101).
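
The rule stated above can be written down as a small sketch. It is only an illustration of the reasoning in the text: the dictionary layout and the function name are hypothetical, and the only data used are the p4 site positions and the presence or absence of a −35 hexamer given for promoters A3 and A2c.

```python
# Illustrative summary of the rule described in the text: the outcome of
# the p4-RNAP contact follows the presence of a -35 hexamer, not the
# position of the p4 binding site. All names here are hypothetical.
PROMOTERS = {
    "A3": {"p4_site_center": -82, "has_minus35_hexamer": False},
    "A2c": {"p4_site_center": -71, "has_minus35_hexamer": True},
}


def p4_outcome(has_minus35_hexamer: bool) -> str:
    """Predicted effect of p4 at a promoter where it contacts RNAP."""
    # A consensus -35 hexamer plus the p4 contact overstabilizes RNAP.
    return "repression" if has_minus35_hexamer else "activation"


for name, props in PROMOTERS.items():
    outcome = p4_outcome(props["has_minus35_hexamer"])
    print(name, props["p4_site_center"], outcome)
```

The p4 site position is carried along only to make the point that it does not enter the decision.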

The viral double-stranded DNA binding protein p6, 
involved in repression of the early promoter C2, also plays a 
role both in the repression of promoters A2c and A2b and 
in the activation of promoter A3, but only in the presence of
protein p4 (45). The two proteins, p4 and p6, cooperate with
each other in the binding to the central region of the φ29
genome containing the A2c, A2b, and A3 promoters, resulting
in a ternary p4-p6-DNA complex that affects local DNA
topology. Through this complex, protein p6 exerts a role in
the repression of promoter A2c, impeding unwinding of the
DNA strands needed for open complex formation. In contrast,
protein p6 functions by reinforcing the positioning
of protein p4 in the repression of promoter A2b and activation
of promoter A3, thereby facilitating transcription regulation
(32). It is interesting to point out that a protein p4
mutant at Arg120, which is not able to interact with the
RNAP α-subunit, is able to form the p4-p6-DNA complex
and, thus, both to repress the A2c promoter in the presence
of protein p6 and to activate the A3 promoter, although in
this case to a lower extent than that obtained with the wild-type
p4. Other p4 mutants that are unable to form the
p4-p6-DNA complex are inactive in both activation and
repression. Thus, formation of the ternary p4-p6-DNA
complex is a requisite of transcription activation and repression
mediated by both proteins, p4 and p6 (29).

Taking into account all the results, the switch from early 
to late transcription at the central promoter region containing
the early promoters A2b and A2c and the late promoter
A3 can be envisioned as follows. Early after infection, when
proteins p4 and p6 have not yet been synthesized in sufficient
amounts, the host RNAP is able to transcribe from the
early promoters A2b and A2c, but not from the late promoter
A3. When both proteins p4 and p6 have been synthesized
in sufficient amounts, either of two mechanisms can operate.
When only protein p4 binds to a DNA molecule, it stabilizes
the binding of RNAP to the late A3 promoter leading to
transcription activation. Binding of p4 to its site at the late 
promoter excludes the RNAP from the early A2b promoter, 
leading to transcription repression of this promoter. On the 
other hand, binding of protein p4 to site 2 at the early A2c 
promoter cooperates with the binding of RNAP giving rise 
to an overstabilization of the latter, which is unable to leave 
the promoter, leading also to transcription repression. When 
both proteins p4 and p6 bind to the same DNA molecule, 
repression of promoter A2c takes place, because the formation
of the open complex is prevented. On the other hand,
protein p6 reinforces the positioning of protein p4 at the
site located upstream from the A3 promoter (site 3), cooperating
in the repression of promoter A2b and in the activation
of promoter A3 (32).

Transcription Termination 

In phage φ29, transcription starting at the late A3 promoter
and at the early C2 and C1 promoters terminates in a short
intergenic region between gene 16 and ORF 16.5 (see
figure 22-1; 4). This DNA region contains an inverted repeat
and stem-loop structures with calculated free energies of
−14.8 and −16.8 kcal for the early and late transcripts,
respectively. In both directions a uridine-rich tail follows the
stem-loop, indicating that it functions as a Rho-independent
transcriptional terminator, which was named TD1. Similar
sequences are present at the corresponding positions in 
phages B103 and GA-1. 

Another Rho-independent transcriptional terminator, 
named TA1, was found within gene 4 of φ29 (4). Thus, some
transcripts initiated at promoters A2b and A2c could terminate
at TA1, and other transcripts would terminate at the
DNA end. This would result in the synthesis of higher levels
of messenger RNAs coding for proteins p6 (DBP) and p5
(SSB) than of messenger RNAs coding for proteins p4
to p1, and could explain the high level of synthesis of
proteins p6 and p5 relative to proteins p4 to p1 (reviewed in
90). Similar TA1 transcriptional terminators are present in 
the B103 and GA-1 DNAs. 

In addition, three potential Rho-independent transcriptional
terminators are present at the left part of GA-1 DNA
(90). The function, if any, of these potential transcriptional 
terminators remains to be elucidated. 


Replication of φ29 and its Relatives
Replicative Intermediates 

Two types of replicative intermediates have been observed 
by electron microscopy in φ29-infected B. subtilis: type I
molecules, which are double-stranded DNA with single- 
stranded tails coming from one or from the two DNA ends, 
and type II molecules, which are partially double-stranded 
and partially single-stranded (73, 83). Analysis of these 
replicative intermediates showed that replication starts at 
either DNA end, nonsimultaneously, and proceeds towards 
the other end by strand displacement. However, fully 
displaced single-stranded DNA is never found. This, as well 
as other in vivo and in vitro evidence indicates that, before 
replication started at one DNA end reaches the other end, 
replication starts at the latter end, and separation of the two 
displacement forks, when they meet, produces two type II 
molecules. 


Proteins Involved in Replication 

By using ts and sus mutants of phage φ29 (93, 105, 139), six
genes, 1, 2, 3, 5, 6, and 17, were shown to be involved in the
viral DNA replication (36, 72, 114, 140), although genes 1
and 17 were partially dispensable, depending on the growth
conditions (25, 39; see later). More recently, gene 16.7 has
been characterized, and a suppressor-sensitive mutant was
constructed. In the absence of p16.7, φ29 DNA replication
was delayed, suggesting the involvement of this protein in
the replication of the viral DNA (92). Genes 2, 3, 5, 6, 16.7,
and 17 are also present in phages B103 and GA-1, whereas
gene 1 is only present in phage φ29.

In vivo φ29 DNA replication does not require the B. subtilis
DNA polymerases I or III (113, 140). So far, no bacterial gene
has been shown to be required for φ29 DNA replication (119).

Protein-Primed Initiation of DNA Replication 

It is well known that DNA polymerases are not able to initiate
DNA synthesis on a DNA template, but require a primer
containing a free hydroxyl (OH) group to start DNA elongation
(86). In many cases, RNA primers provide the 3'-OH
group needed by the DNA polymerase to elongate the DNA 
chain. In other cases, the 3'-OH group is created in the DNA 
template, either by the introduction of a specific nick in one 
of the strands of a circular double-stranded DNA, or by the 
formation of hairpin structures at the DNA ends. In the case 
of most linear DNA genomes containing a protein covalently 
linked to their 5' ends (terminal protein, TP), the OH group 
of a specific serine, threonine or tyrosine residue of the TP 
is used by the DNA polymerase for initiation (reviewed in 
120 and 123). In phage φ29, the initiation of replication
occurs on the TP-DNA template by deoxynucleotidylation of 
the TP by the DNA polymerase, resulting in the covalent 
linkage of 5'-dAMP through a phosphoester bond to the 
hydroxyl group of Ser232 of the TP (14, 75). This reaction 
requires the formation of a functional heterodimer between 
the TP and the DNA polymerase (13). The DNA polymerase 
active site used for polymerization seems to be used also for 
the TP deoxynucleotidylation reaction (reviewed in 18), 
implying a specific positioning of the TP in the heterodimer 
to allow the DNA polymerase to carry out the reaction. 
Several mutations in different regions of the DNA polymerase,
as well as the absence of the N-terminal domain, have
been shown to affect its interaction with the TP (23, 40, 44,
99, 144, 146).

The φ29-related phages M2 (89) and Nf (71) belonging to
group II, and group III phage GA-1 (80), have been shown to 
initiate replication by a similar protein-primed mechanism. 

In addition, the Streptococcus pneumoniae phage Cp-1, the 
Escherichia coli phage PRD1, and adenovirus also replicate
by a protein-priming mechanism (reviewed in 120, 123).

Sliding-Back Mechanism 

As already indicated, the DNA ends of φ29 have an inverted
terminal repeat of six nucleotides (3'-TTTCAT). Interestingly,
the linkage of the first dAMP to the TP is not directed by the
3'-terminal nucleotide, but by the second nucleotide from
the 3'-DNA end. Then, the TP-dAMP complex slides back
one nucleotide, and the second 3'-terminal nucleotide acts
again as template to direct the incorporation of the second
dAMP residue (98). For this sliding-back mechanism to take
place, a reiteration of at least two nucleotides is required. All
the φ29-related phages have a reiteration of three Ts at the
3' ends of their DNAs. Indeed, it has been shown that phage
GA-1 also initiates replication at the second nucleotide from
the 3'-DNA end (80). Since the TP-dNMP covalent complex is
not a substrate of the 3'-5' exonuclease proofreading activity
of the DNA polymerase (48), the sliding-back mechanism
would be a way to ensure that the initiation of replication
occurs with high fidelity.
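
The order of events in the sliding-back mechanism can be followed in a short simulation. This is a minimal sketch under stated assumptions: the template is the six-nucleotide terminal repeat 3'-TTTCAT given in the text, only one sliding-back step is modeled, and the function name and the bookkeeping of template positions are hypothetical conveniences rather than part of any published protocol.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def protein_primed_start(template_3to5: str):
    """Simulate initiation with a single sliding-back step.

    template_3to5 is the template strand read from its 3' end.
    Returns the nascent strand (5'->3') and, for each inserted
    nucleotide, the 1-based template position that directed it.
    """
    if len(template_3to5) < 2 or template_3to5[0] != template_3to5[1]:
        raise ValueError("sliding-back needs a terminal reiteration of >= 2 nt")
    nascent = []
    directed_by = []
    # Initiation: the second nucleotide from the 3' end is the template.
    nascent.append(COMPLEMENT[template_3to5[1]])
    directed_by.append(2)
    # Sliding-back: the TP-dNMP moves back to pair with position 1,
    # so position 2 is free to act as template a second time.
    for pos in range(2, len(template_3to5) + 1):
        nascent.append(COMPLEMENT[template_3to5[pos - 1]])
        directed_by.append(pos)
    return "".join(nascent), directed_by


strand, positions = protein_primed_start("TTTCAT")
print(strand, positions)
```

Position 2 appears twice in the list of templating positions, while position 1 never templates an insertion; the terminal nucleotide is nevertheless recovered in the product because the primer slides back onto it.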

A terminal reiteration is also present in other DNAs that 
initiate replication by protein-priming. Thus, the initiation 
site of replication is also an internal nucleotide close to the 
3'-DNA end in the case of phages PRD1 (28) and Cp-1 (87),
and in adenovirus (85). 

Transition from Protein-Primed to DNA-Primed Replication

The φ29 DNA polymerase and the primer TP do not dissociate
immediately after initiation, nor after the sliding-back
step. There is a stage in which the DNA polymerase synthesizes
a DNA molecule of five nucleotides while complexed
with the primer TP (initiation mode). Then, during the incorporation
of nucleotides 6 to 9 the complex undergoes some
structural change (transition mode), and finally the DNA
polymerase dissociates from the primer TP when nucleotide
10 is inserted into the nascent DNA chain (elongation mode)
(97). These facts probably reflect the requirement of a DNA
primer of a minimal length by the DNA polymerase to carry
out DNA elongation efficiently. Interestingly, the φ29 DNA
polymerase mutant, in which Asp456 of the conserved
motif YxDTDS (see later) has been changed into Gly, is
unable to proceed further than five nucleotides. Thus, residue
Asp456 is important to enter into the transition stage of
φ29 DNA replication (125). A similar transition step takes
place in adenovirus replication (84), and is probably a 
general feature of protein-primed DNA replication. 
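
The three stages described above depend only on the length of the nascent chain, which can be expressed as a simple classification. The sketch below uses the nucleotide boundaries given in the text (1 to 5, 6 to 9, and 10 onward); the function name is hypothetical.

```python
def replication_mode(nascent_length: int) -> str:
    """Stage of protein-primed replication for a nascent chain length."""
    if nascent_length < 1:
        raise ValueError("at least one nucleotide must be linked to the TP")
    if nascent_length <= 5:
        return "initiation"   # polymerase still complexed with primer TP
    if nascent_length <= 9:
        return "transition"   # complex undergoes structural change
    return "elongation"       # polymerase has dissociated from primer TP


for n in (1, 5, 6, 9, 10):
    print(n, replication_mode(n))
```

In these terms, the Asp456-to-Gly mutant described above would never leave the first category.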

Replication Origins 

The origins of replication of φ29 and related phages are the
DNA ends with the corresponding TP (parental TP) covalently
linked to them. The first step in the initiation of DNA
replication is the recognition of the origins by the specific
TP/DNA polymerase (pol) heterodimer. Thus, φ29 TP-DNA
is recognized by the φ29 TP/pol heterodimer but not by the
Nf TP/pol one, and Nf TP-DNA is better recognized by the Nf
TP/pol heterodimer than by the φ29 TP/pol one. Nonetheless,
φ29 TP-DNA can be recognized by the hybrid heterodimer
formed by the Nf TP and φ29 pol, but not by the one containing
the φ29 TP and Nf pol. Likewise, Nf TP-DNA is recognized
by the hybrid formed by φ29 TP and Nf pol but not by
the one containing Nf TP and φ29 pol. Thus, initiation of
replication can occur when TP-DNA and DNA polymerase 
are from the same phage, indicating that specific recognition 
of origins is brought about through interaction between 
DNA polymerase and parental TP (59). 


On the other hand, the φ29 TP residues Asn80 and
Tyr82, which are conserved in the TPs of φ29-related
phages, have been shown to be involved in the recruitment
of the TP/pol heterodimer through interactions between the
parental and primer TP (81). These residues are located
before a region, spanning amino acids 84 to 118, that has a
high probability of forming an amphipathic α-helix,
conserved in the TPs of φ29-related phages. Indeed, this
region of φ29 TP has been shown to be an important
element for origin recognition (127). Thus, both primer TP 
and DNA polymerase are involved in interactions with the 
parental TP for origin recognition. 

Blunt-ended DNA fragments containing the left or right 
φ29 DNA ends are active templates for the in vitro initiation
of replication, although the activity is greatly reduced with
respect to that of TP-containing DNA (53, 70). The terminal
12 bp at each φ29 DNA end were shown to be the minimal
replication origin (70). On the other hand, efficient initiation
requires only the terminal repetition 3'-TT. The 3'-terminal
T, although not used as template, increases the affinity of
DNA polymerase for the initiation nucleotide (60). Although
the parental TP is partially dispensable in vitro, no replication
is obtained in vivo after transfection of B. subtilis protoplasts
with φ29 DNA molecules lacking one of the two
parental TPs, in agreement with a key role of the parental
TP in φ29 DNA replication (46).

Essential Proteins Required for φ29 DNA Replication and In Vitro Amplification

Genes 2, 3, 5, and 6 of φ29 are indispensable for in vivo phage
DNA replication. These four genes are conserved in phages
B103 and GA-1. As already mentioned, gene 2 codes for the
DNA polymerase, gene 3 encodes the TP, gene 5 encodes the
SSB protein, and gene 6 codes for the DBP.

An in vitro φ29 DNA replication system has been
obtained using these four purified proteins. Thus, by using
appropriate amounts of purified TP, DNA polymerase, DBP,
and SSB, it has been possible to amplify in vitro small
amounts (0.5 ng) of φ29 TP-DNA (19,285 bp long) by 3
orders of magnitude (0.5 µg) after 1 hour of incubation at
30 °C. The fidelity of the amplified DNA was demonstrated by
its infectivity, which, measured as the ability to produce phage
particles in transfection experiments, was identical to that
of the natural φ29 DNA obtained from virions (12).
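
The scale of this amplification can be checked with a few lines of arithmetic. The sketch below takes the input as 0.5 ng and the output as 0.5 µg, consistent with the three orders of magnitude stated in the text, and the genome length of 19,285 bp. The average mass of 650 g/mol per base pair is a common approximation and an assumption here; the mass of the terminal protein is ignored, and the number of "equivalent doublings" is only a way of expressing the yield, not a claim about the kinetics.

```python
import math

AVOGADRO = 6.022e23      # molecules per mole
BP_MASS = 650.0          # g/mol per base pair, approximate (assumption)
GENOME_BP = 19_285       # length of phi29 DNA given in the text

input_g = 0.5e-9         # 0.5 ng of TP-DNA
output_g = 0.5e-6        # 0.5 micrograms after 1 hour

fold = output_g / input_g
equivalent_doublings = math.log2(fold)

# Approximate number of genome copies present at the start.
genome_molar_mass = GENOME_BP * BP_MASS
input_molecules = input_g / genome_molar_mass * AVOGADRO

print(f"amplification: {fold:.0f}-fold")
print(f"equivalent doublings: {equivalent_doublings:.2f}")
print(f"input genomes: {input_molecules:.2e}")
```

Under these assumptions the reaction starts from a few tens of millions of genome copies, and the yield corresponds to about ten doublings in one hour.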

Primer TP 

As already mentioned, the primer TP forms a heterodimer
with the DNA polymerase for the initiation of TP-primed DNA
replication. Several lines of evidence indicate that the TP
occupies the double-stranded DNA binding channel in the
DNA polymerase during initiation of replication (40, 144).
Heterodimer formation involves several contacts with different
regions of the φ29 TP, as was indicated before to be
the case for the φ29 DNA polymerase. These include an
internal region spanning amino acids 72 to 80, amino acids
242 to 262 at the C-end (154), and the RGD motif located
at positions 256 to 258 in the φ29 TP (82). The RGD motif
is conserved in most φ29-related phages, but not in
phage GA-1. 

In φ29, the 5'-terminal dAMP is linked through a phosphoester
bond to the hydroxyl group of Ser232 of the TP
(75). This amino acid residue is highly specific since a
mutant φ29 TP, in which Ser232 was replaced by Thr,
completely lost its priming activity (55). Residue Ser232 is 
conserved in the TP of phages B103 and GA-1, suggesting 
that it is also used for the covalent linkage of the first 
dAMP in both phages. 

DNA Polymerase 

DNA polymerase is encoded by gene 2 in phages φ29, B103,
and GA-1. In φ29, Nf (belonging to group II), and GA-1, the
DNA polymerase has been shown to be required for the
replication of the viral DNA (14, 59, 80, 150). The φ29 DNA
polymerase is inhibited by aphidicolin, phosphonoacetic
acid, and the nucleotide analogs butylanilino-dATP and
butylphenyl-dGTP, known inhibitors of eukaryotic DNA
polymerase α. These functional, as well as structural, criteria
have allowed the classification of the φ29 DNA polymerase
as belonging to the B-type superfamily of DNA-dependent
DNA polymerases (also named eukaryotic- or α-like DNA
polymerases). φ29 DNA polymerase, which is a monomer of
about 65 kDa, catalyzes both the initiation and elongation
stages of DNA synthesis (14, 15). Thus, it is able to carry out
two different reactions: TP-deoxynucleotidylation and DNA
polymerization. In addition, it has two degradative activities:
pyrophosphorolysis (19) and 3'-5' exonucleolysis (16, 151).
Moreover, it has two very important intrinsic properties:
high processivity and strand displacement activity (10). The
insertion discrimination values of the φ29 DNA polymerase
range from 10⁴ to 10⁶, and the efficiency of mismatch elongation
is 10⁵- to 10⁶-fold lower than that of a paired terminus
(48). This high fidelity of φ29 DNA polymerase is due to its
3'-5' exonuclease activity (54).

Site-directed and deletion mutagenesis have shown that 
the φ29 DNA polymerase has a bimodular organization.
The N-terminal domain contains the 3'-5' exonuclease
activity, as well as the strand displacement ability, and the
C-terminal domain contains the 5'-3' synthetic activities,
initiation and polymerization, as well as the pyrophosphorolytic
activity (reviewed in 17, 18).

N-Terminal Domain 

Three conserved regions, named ExoI, ExoII, and ExoIII,
form the 3'-5' exonuclease active site and are evolutionarily
conserved in prokaryotic and eukaryotic DNA polymerases
(6). The three Exo motifs contain five invariant residues
involved in metal binding and 3'-5' exonuclease catalysis
that, in the case of φ29 DNA polymerase, are Asp12 and
Glu14 in ExoI, Asp66 in ExoII, and Tyr165 and Asp169 in
ExoIII (6). Interestingly, mutations in these five amino acid
residues, which strongly reduced the 3'-5' exonuclease
activity, also greatly affected the strand displacement
capacity (49, 135). Another invariant residue, Lys143
of φ29 DNA polymerase, was shown to be important both
for the catalytic efficiency of the 3'-5' exonuclease activity
and for the strand displacement capacity (41).

Other residues in the Exo motifs, conserved in most 
prokaryotic and eukaryotic DNA polymerases, as well as in 
those of phages B103 and GA-1, were also analyzed by site- 
directed mutagenesis. Two of these residues, Thr15 and
Asn62, located at the ExoI and ExoII motifs, respectively, act
as single-stranded DNA ligands, having a critical role in the
stabilization of the frayed primer terminus at the 3'-5'
exonuclease active site, but they do not affect the strand
displacement activity (42). In addition, Phe65 of the
ExoII motif, and residues Ser122 and Leu123, which are
part of a newly identified motif between motifs ExoII
and ExoIII, act as single-stranded DNA ligands for 3'-5'
exonucleolysis (43). 

C-Terminal Domain 

This region of the φ29 DNA polymerase contains five
motifs also conserved in other DNA polymerases that
belong to family B. These motifs are Dx₂SLYP (motif A or 1),
Kx₃NSxYG (motif B or 2a), YxDTDS (motif C or 3), Tx₂GR
(motif 2b), and KxY (motif 4). Mutational analysis showed
that this domain of the φ29 DNA polymerase contains the
polymerization and protein-primed initiation activities,
with sites for interaction with the metal activator,
dNTPs, TP, and DNA. For several DNA polymerases, including
the φ29 DNA polymerase, three Asp residues are
involved in metal binding and catalysis at the polymerization
active site. These three residues are Asp249 (motif
Dx₂SLYP), Asp456, and Asp458 (motif YxDTDS) in the φ29
DNA polymerase (7, 21). In addition, φ29 DNA polymerase
residue Arg438 (motif Tx₂GR) plays a role in catalysis of the
polymerization reaction (99). Three Tyr residues, highly
conserved in family B DNA polymerases, have been shown
to be involved in interaction with dNTPs. In the φ29 DNA
polymerase these residues are Tyr254 (motif Dx₂SLYP; 20),
Tyr390 (motif Kx₃NSxYG; 20, 22), and Tyr454 (motif
YxDTDS; 7). Residues Tyr254 and Tyr390 are also involved
in nucleotide binding selection, thus playing an important
role in the fidelity of DNA replication (124). In addition, a
specific change of Tyr254 into Val enables the mutant φ29
DNA polymerase to incorporate ribonucleotides without
affecting its capacity to incorporate dNTPs (24). This indicates
that Tyr254 of φ29 DNA polymerase is responsible
for the discrimination against the 2'-OH group of an incoming
ribonucleotide. Another φ29 DNA polymerase residue,
Lys383, invariant in motif Kx₃NSxYG of family B DNA
polymerases, is also involved in dNTP binding (126).

On the other hand, eight residues highly conserved in
family B DNA polymerases were shown to be involved in
binding template-primer structures in the φ29 DNA polymerase.
These residues are Ser252 (motif Dx₂SLYP; 21),
Asn387, Gly391, and Phe393 (motif Kx₃NSxYG; 22), Thr434
and Arg438 (motif Tx₂GR; 99), and Lys498 and Tyr500
(motif KxY; 23).

Coordination Between Synthesis and Degradation

A conserved motif, YxGG/A, located between the exonuclease
and polymerization domains of family B DNA polymerases,
is a DNA binding motif that plays an important role
in the coordination between DNA synthesis and proofreading
(145). Moreover, this motif has been shown to be
important for the formation of a stable complex between TP
and φ29 DNA polymerase, affecting initiation and transition
in φ29 TP-DNA replication (144).

A C-terminal deletion derivative of φ29 DNA polymerase,
containing the first 188 N-terminal amino acids, was devoid
of synthetic activities (TP-primed initiation and DNA polymerization),
but retained some 3'-5' exonuclease activity
(17). In addition, an N-terminal deletion derivative, containing
amino acids 189 through 575, lacked 3'-5' exonuclease
activity and strand displacement capacity. It retained some
polymerization activity, although this was decreased with
respect to that of the complete polymerase, being distributive
instead of processive. These polymerization defects
could be related to a strong impairment in DNA binding, 
suggesting that contacts present in the N-terminal domain 
are important for an optimal stabilization and translocation 
of the DNA during polymerization. In addition, the C- 
terminal domain showed a reduced initiation of TP-primed 
DNA replication due to a reduced capacity to interact with 
the primer TP and a lack of activation by protein p6 (146). 

Protein p6 (DBP) 

The double-stranded DNA binding protein p6 (DBP), 
described as a histone-like protein, is able to bind in vitro to 
the whole φ29 DNA, and a role in genome organization has
been proposed (68, 129). Protein p6 binds preferentially to
the φ29 DNA ends (115, 131) through the minor groove (50).
The main p6 binding sites are located at nucleotides 46 to 68
and 62 to 125 at the left and right φ29 DNA ends, respectively.
These regions do not have sequence similarity, but
they contain sequences that are predicted to be bendable 
every 12 bp (131), this feature being the main determinant 
for p6 recognition (130). 

In addition to the role of p6 in transcription regulation 
(see above), its binding to the φ29 DNA ends activates the
initiation of DNA replication (11). When protein p6 binds to
circular DNA it restrains positive supercoiling, supporting
a model in which a right-handed DNA superhelix wraps
tightly around a multimeric protein p6 core (132). The DNA
in the complex with p6 is compacted 4.2-fold. The parameters
that define the path followed by the DNA in the p6
complex have been determined: one superhelical turn has
63 bp with a pitch of 5.1 nm and a diameter of 6.6 nm. Thus,
the DNA should be bent (66° every 12 bp) and underwound
(11.5 bp per turn). It is believed that the conformation of the
nucleoprotein complex would help to open the DNA ends
facilitating the formation of the TP-dAMP initiation complex
(122).
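
The figures quoted for the p6-DNA complex are mutually consistent, as a short calculation shows. The sketch below uses the parameters given in the text (63 bp per superhelical turn, a pitch of 5.1 nm, a diameter of 6.6 nm, and a bend of 66° every 12 bp). The axial rise of 0.34 nm per base pair is the standard B-DNA value and is an assumption here, since the DNA in the complex is underwound; the variable names are hypothetical.

```python
import math

RISE_PER_BP_NM = 0.34          # B-DNA rise per base pair (assumption)
BP_PER_SUPERHELICAL_TURN = 63
PITCH_NM = 5.1
DIAMETER_NM = 6.6
BEND_DEG = 66.0
BEND_SPACING_BP = 12

# Contour length of the DNA contained in one superhelical turn.
contour_nm = BP_PER_SUPERHELICAL_TURN * RISE_PER_BP_NM

# Compaction: contour length divided by the axial advance per turn.
compaction = contour_nm / PITCH_NM

# Path length of one turn of a helix with this diameter and pitch.
helix_path_nm = math.hypot(math.pi * DIAMETER_NM, PITCH_NM)

# Total bending accumulated over one superhelical turn.
bend_per_turn_deg = BP_PER_SUPERHELICAL_TURN / BEND_SPACING_BP * BEND_DEG

print(f"contour length per turn: {contour_nm:.2f} nm")
print(f"compaction: {compaction:.1f}-fold")
print(f"helical path per turn: {helix_path_nm:.2f} nm")
print(f"bend per turn: {bend_per_turn_deg:.1f} degrees")
```

With these inputs the compaction comes out at about 4.2-fold, the geometric path length of one turn is close to the contour length of 63 bp, and the accumulated bend approaches a full circle.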

By site-directed and deletion mutagenesis it has also been 
shown that the N-terminal region of p6 is involved in DNA 
binding. Specifically, mutations in residues Lys2 and Arg6 
produced p6 proteins affected in DNA binding (50). 

Protein p6 was shown to form dimers in solution (110). 
Residues critical for the self-association of the protein, identified
by random mutagenesis, are Ile8 and Ala44. Mutants
at these two residues showed, in addition to impaired
dimer formation ability, reduced DNA binding affinity, and
they were affected in the initiation of DNA replication (1).
Thus, dimers seem to be the active form of p6 for DNA 
binding. 

Protein p6 binds with a defined phase to the φ29 DNA
ends, which is essential for the activation of the initiation of
DNA replication (132). By using the protein p6 counterparts
of phages Nf (of the same group as B103) and GA-1, as well as
the TP and DNA polymerase from these phages, in addition to
the TP-DNA complexes of φ29, Nf, and GA-1, it has been
shown that the activation of the initiation of replication 
requires not only the formation of a specific nucleoprotein 
complex but also its specific recognition by the proteins 
involved in the initiation of DNA replication (51). 

Protein p5 (SSB) 

The φ29 SSB protein, product of gene 5, is essential for the
elongation of in vivo viral DNA replication (94). Binding of
φ29 SSB to the single-stranded DNA produced during φ29
DNA replication in vitro has been demonstrated (69). This
binding stimulates dNTP incorporation during φ29 DNA
replication (88), and increases the elongation rate, mainly
when φ29 DNA polymerase mutants impaired in strand
displacement are used (135). This effect is probably due to
the helix destabilizing activity of the SSB protein (137).
When φ29 DNA amplification assays are carried out in the
absence of SSB, short φ29 DNA products are produced that
have a palindromic nature and are caused by DNA polymerase
template-switching (12, 47).

φ29 SSB binds single-stranded DNA in a cooperative way
(K_eff = 10⁵ M⁻¹; ω = 50-70), covering 3.4 nucleotides per
monomer (136). A comparative study of the structural
complexes formed by the φ29, Nf, and GA-1 SSB with
single-stranded DNA has been carried out (57). The SSB of
φ29 and Nf are monomers in solution, whereas GA-1 SSB
behaves as a hexamer. The binding site size of the latter is
51 nucleotides per hexamer, compared with 3.4 and 4.7
nucleotides per monomer, respectively, for φ29 and Nf SSB.
In addition, the length of circular single-stranded DNA
was reduced about 6-fold upon GA-1 SSB binding, and only
2-fold when φ29 SSB was bound to the DNA. In agreement
with the structural features, less GA-1 SSB than φ29 or Nf
SSB was needed for a similar helix destabilizing activity 
and to stimulate in vitro DNA replication (58). Comparison 
of the amino acid sequence of the three SSB showed that 
GA-1 SSB has an N-terminal addition of about 40 amino 
acids. Deletion analysis has indicated that the region 
comprising amino acids 19 to 26 is essential for GA-1 SSB 
oligomerization (56). 
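
The binding site sizes quoted above can be compared on a per-monomer basis. The following sketch uses only the site sizes and oligomeric states given in the text; the 1000-nucleotide stretch is a hypothetical example, and the estimate ignores cooperativity, end effects, and incomplete coverage, so it indicates relative protein requirements rather than measured stoichiometries.

```python
import math

# (nucleotides covered per binding unit, monomers per binding unit)
SITE_SIZE = {
    "phi29": (3.4, 1),   # monomer
    "Nf": (4.7, 1),      # monomer
    "GA-1": (51.0, 6),   # hexamer
}


def nt_per_monomer(phage: str) -> float:
    """Nucleotides covered per SSB monomer."""
    size, monomers = SITE_SIZE[phage]
    return size / monomers


def monomers_to_coat(phage: str, length_nt: int) -> int:
    """Monomers needed to cover a single strand of the given length."""
    size, monomers = SITE_SIZE[phage]
    return math.ceil(length_nt / size) * monomers


for phage in SITE_SIZE:
    print(phage, nt_per_monomer(phage), monomers_to_coat(phage, 1000))
```

On this simple count each GA-1 SSB monomer covers about 8.5 nucleotides, so fewer monomers are needed to coat the same stretch of single-stranded DNA, in line with the observation that less GA-1 SSB is required.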

Other Proteins Involved in DNA Replication 

Protein p1

Phage φ29 DNA replication was strongly reduced when
nonsuppressor B. subtilis cells were infected with mutant
phage sus1(629) at 37 °C (25, 114). However, infection at
30 °C reduced phage DNA synthesis to a lesser extent (26).
Protein p1 associates with the cell membrane, the 43 C-terminal
amino acids being required for this association (25). In
addition, protein p1 lacking the 33 N-terminal amino acids
assembled into long protofilaments associated in a highly
ordered, parallel array forming two-dimensional sheets
(26). Protein p1 has been shown to interact with the TP in
vitro (27). Altogether, the results suggest that protein p1
is a component of a virus-associated membrane structure 
which would provide an anchoring site for the phage DNA 
replication machinery. 

As already indicated, phage B103 and GA-1 do not contain 
a corresponding gene 1. However, the deduced protein 
sequence of ORF h of phage B103 shows significant similarity
(62.5%) to the 46 C-terminal amino acids of φ29 protein
p1. The function, if any, of ORF h of phage B103 remains
to be determined. 

Protein p17

When nonsuppressor B. subtilis cells were infected with the
φ29 mutant sus17(112), viral DNA synthesis was reduced
(36). In addition, infection with the sus17(112) mutant in
solid media produced very tiny plaques (93). More recently,
it has been shown that viral DNA synthesis after sus17(112)
infection depends on the multiplicity of infection. At a low
multiplicity, viral DNA synthesis is strongly reduced,
whereas a high multiplicity of infection results in a wild-type
phenotype (39). In addition, a stimulatory effect by
protein p17 of in vitro φ29 DNA amplification was observed
under conditions of limiting amounts of DNA and initiation
proteins (39). It has been shown that protein p17 interacts
in vitro with protein p6 and stimulates binding of the latter
to the φ29 DNA ends (38).

The gene 17 corresponding to other φ29-related phages
is organized differently, with deletions at several regions,
even in the case of phages that belong to group I, such as
PZA and φ15 (5, 109). A comparison of gene 17 of different
φ29-related phages has been carried out (111).

Protein p16.7

ORF 16.7, located at the right end of the φ29 genome,
encodes a protein of 130 amino acids, p16.7, which is
expressed early after infection. Computer-assisted analysis
showed that the 22 N-terminal amino acids of p16.7 have
a very hydrophobic character and may constitute a transmembrane-spanning
domain. On the other hand, the
region spanning amino acids 19 to 60 has a high potential
to form an α-helical coiled-coil structure and, thus, could
function as an oligomerization domain. In addition, the
C-terminal part of p16.7 (amino acids 70 to 130) shows some
similarity to DNA binding proteins. Indeed, it has been
shown that p16.7 is a membrane protein, and the N-terminal
transmembrane domain is required for its membrane localization
(92). A variant protein, p16.7A, was constructed, in
which the N-terminal membrane anchor domain was
replaced by a histidine tag. The purified protein was shown
to form dimers in solution and to bind to single-stranded
DNA. In fact, it binds in vitro to the displaced strands of φ29
replicative intermediates, and it is able to replace the φ29
SSB protein in φ29 DNA amplification assays (128). However,
unlike the φ29 SSB, protein p16.7 does not have helix-
destabilizing activity. 

To study the in vivo role of protein p16.7, a φ29 mutant
containing a suppressible mutation in gene 16.7 was
constructed. In vivo phage DNA replication was significantly
delayed after restrictive infection with the sus16.7 mutant
(92). It was also found that redistribution of replicating
phage DNA from the initial replication site to various sites
surrounding the nucleoid was dependent on protein
p16.7 (91). A model has been proposed in which the main
role of p16.7 in vivo involves the attachment of replicating
φ29 DNA molecules to the membrane of infected cells (128).
Homologs of gene 16.7 are present in phages B103 and
GA-1, suggesting that the proposed role of p16.7 is conserved
in this family of phages. 

Phage Morphogenesis 

Phage morphogenesis has been thoroughly studied in φ29 
(for a review see 2), and a very efficient in vitro system 
for DNA packaging has been established (8, 64). In addition, 
the three-dimensional structure of the φ29 particle and 
that of the empty proheads have been determined (141). 
Both infection with φ29 mutants and in vitro assembly 
have been used to establish the morphogenesis pathway 
of φ29. No similar studies have been carried out in 
phages B103 or GA-1. 

Prohead Formation 

The proheads of φ29, which have the form of a prolate 
icosahedron, contain 235 copies of the major capsid protein (p8), 
about 180 copies of the scaffolding protein (p7), 55 dimers of 
the head fiber protein (p8.5), 12 copies of the head-tail 
connector (p10), six copies of the 174-nucleotide-long 
pRNA, and five or six copies of the ATPase (p16). The major 
capsid proteins are hexameric structures located at each of 
the 3-fold axes and pentameric structures at each of the 
5-fold axes. The head fibers attach as dimers to the p8 subunits 
at quasi-3-fold axes that relate one pentamer to a pair of 
hexamers (141). Isometric particles are formed in gene 10 
(coding for the connector) mutant-infected cells (33), 
suggesting that the connector is the structure from which 
capsid assembly is initiated (63). 
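
The copy number of p8 quoted above can be checked against simple capsid geometry. The sketch below assumes a prolate lattice with triangulation numbers T = 3 and Q = 5; these values are not given in this section and are supplied here only for illustration. It also assumes that one of the twelve vertices is occupied by the connector rather than by a p8 pentamer.

```python
def prolate_capsid_subunits(t: int, q: int, portal_vertices: int = 1) -> int:
    """Number of capsid subunits in a prolate icosahedral shell.

    A closed prolate shell with triangulation numbers T (end caps) and
    Q (elongated mid-section) offers 30 * (T + Q) subunit positions.
    Every vertex taken by a portal (connector) instead of a capsid
    pentamer removes 5 of them.
    """
    return 30 * (t + q) - 5 * portal_vertices


# Assumed lattice parameters (not stated in the text): T = 3, Q = 5.
copies_p8 = prolate_capsid_subunits(t=3, q=5)

# 12 vertices in total, one of which is assumed to carry the connector.
pentamers = 12 - 1
hexamers = (copies_p8 - 5 * pentamers) // 6

if __name__ == "__main__":
    print("p8 copies:", copies_p8)
    print("pentamers:", pentamers, "hexamers:", hexamers)
```

Under these assumptions the count comes out at the 235 copies reported for the prohead, split between pentamers at the remaining vertices and hexamers elsewhere.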

DNA Packaging Machine 

The head-tail connector is an oligomer with 12-fold 
symmetry (35) whose structure and topology have been 
extensively studied (62, 79, 106, 133, 147, 148). The connector 
structure can be divided into three regions: a narrow end, a 
central part, and a wide end. The wide end of the connector, 
which has 12-fold symmetry, is buried inside the prohead, 
whereas the narrow end, with an apparent 6-fold symmetry, 
protrudes from the portal vertex of the prohead. 

Packaging of φ29 DNA into the prohead, both in vivo and 
in vitro, requires a specific small RNA (pRNA) encoded at the 
left of the genome (65, 143). The pRNA forms a hexameric 
ring structure (67, 76, 155) with a diameter similar to that of 
the narrow end of the connector, which results in the 
superposition of the pRNA hexamer on the connector, forming a 
double-ring structure (155). This complex is essential for 
DNA packaging; it recognizes φ29 DNA, and it is probably 
responsible for the specificity of packaging from the left 
DNA end (9). 

Protein p16 of φ29 has ATPase activity dependent 
on DNA, on proheads, and on pRNA (61, 66). Five or six 
copies of protein p16 are required for φ29 DNA packaging 
(133). Thus, the structure formed by 12 molecules 
of the connector and six pRNA molecules forms a very 
efficient DNA-translocating motor that, with the aid of 
the DNA packaging protein p16 and ATP, actively pumps 
the φ29 DNA into the prohead (for a review see 74 and 
chapter 6). This motor can work against loads up to 57 pN 
on average, making it one of the strongest molecular 
motors reported to date (134). Phage φ29 DNA packaging is 
a very energy-consuming process requiring 1 ATP molecule 
to package 2 bp of DNA (66). 
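
The two figures quoted above (a load of up to 57 pN and 1 ATP per 2 bp) can be combined into a rough energy budget. The following is a minimal sketch, not a model of the motor: the rise per base pair, the free energy of ATP hydrolysis, and the genome length used here are assumptions introduced for illustration and do not come from this section.

```python
# Values quoted in the text
MAX_LOAD_PN = 57.0   # load the motor can work against (134)
BP_PER_ATP = 2       # base pairs packaged per ATP hydrolysed (66)

# Assumptions for illustration only (not from the text)
BP_RISE_NM = 0.34             # rise per base pair of B-form DNA
ATP_FREE_ENERGY_PN_NM = 80.0  # rough free energy of ATP hydrolysis
GENOME_BP = 19_300            # approximate phi29 genome length


def work_per_atp(load_pn: float = MAX_LOAD_PN) -> float:
    """Mechanical work (pN*nm) done while packaging BP_PER_ATP base pairs."""
    return load_pn * BP_PER_ATP * BP_RISE_NM


def atp_per_genome(genome_bp: int = GENOME_BP) -> int:
    """ATP molecules needed to package one genome (rounded up)."""
    return -(-genome_bp // BP_PER_ATP)  # ceiling division


if __name__ == "__main__":
    w = work_per_atp()
    print(f"work per ATP at {MAX_LOAD_PN} pN: {w:.2f} pN*nm")
    print(f"fraction of assumed ATP energy: {w / ATP_FREE_ENERGY_PN_NM:.2f}")
    print(f"ATP per genome: {atp_per_genome()}")
```

With these assumed constants, the work done per ATP at maximal load is a substantial fraction of the energy released by hydrolysis, and packaging one genome consumes on the order of ten thousand ATP molecules.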


Phage Maturation 

During DNA packaging, the prohead becomes more angular 
and rigid, and the scaffolding protein p7 is released (141). 
After packaging, the ATPase protein p16 and the pRNA 
molecules are also released from the prohead (66). Then, 
six copies of the lower collar protein (p11), three or four 
copies of the tail protein (p9), and 12 copies of the 
appendages (dimers of protein p12* cleaved from p12 precursor 
molecules) are assembled sequentially (34, 52; for a review 
see 2). For a stable assembly of the tail protein, the 
nonstructural protein p13 is required (52). The lower collar has, like 
the distal end of the connector, a 6-fold symmetry, which 
may explain the high stability of the neck complex (37). 
Removal of the tail protein results in the release of the DNA 
from the particles, indicating that it functions as a stop for 
DNA exit (34). 


Phage Lysis 

In the three phages φ29, B103, and GA-1, genes 14 and 15 
encode a holin and a peptidoglycan hydrolase, respectively, 
both of which are required for efficient lysis of the infected 
bacteria (see chapter 10). Lysis is delayed in cells infected 
with a φ29 sus14 mutant, giving rise to a larger burst size 
than after infection with wild-type phage (36). Holins are 
membrane proteins that introduce pores in the cell 
membrane, allowing the peptidoglycan hydrolase to exit 
the cytoplasm and hydrolyze the cell wall. The φ29 holin 
protein (p14) contains two or three potential transmembrane 
domains (138). It also contains two start codons at 
positions 1 and 3, each with a ribosomal binding site, giving 
rise to two products of 131 and 129 amino acids, respectively 
(142). The two proteins have opposite functions, the longer 
product acting as an inhibitor of the shorter one, the lysis 
effector. Cooperative action of the inhibitor and effector 
results in the proper timing of cell lysis. Gene 14 of phage 
B103 also contains the dual start motif, whereas gene 14 
of GA-1 only has the start codon that would give rise to the 
lysis effector. 

The peptidoglycan hydrolases encoded by phages φ29 
and B103, also named lysozymes, belong to the group of 
muramidases (112, 118). The peptidoglycan hydrolase 
encoded by gene 15 of phage GA-1 only has moderate homology 
with those of φ29 and B103 (see table 22-1; 90). In fact, 
the gene 15 product of phage GA-1 is more closely related to 
several Bacillus-encoded autolysins belonging to the group 
of amylases. 


Conclusions 

The molecular mechanisms underlying the different stages 
in the development of phage φ29 have been unraveled 
through more than 30 years of research. In several respects, 
φ29 turns out to be a model system in studies of replication, 
regulation of transcription, and phage morphogenesis. In 
replication, the novel protein-primed mechanism has been 
worked out, with the finding of a unique DNA polymerase 
that, by itself, is highly processive and able to produce 
strand displacement. Using the four main replication 
proteins, in vitro φ29 DNA amplification has been obtained. 
In transcriptional regulation, the concerted action of two 
proteins can operate. One is a sequence-specific regulatory 
protein that acts as an activator or as a repressor, depending 
upon the context of the promoter. The other is a histone-like 
protein that cooperates with the regulatory protein, 
both in activation and in repression. In phage morphogenesis, 
a very efficient system of DNA packaging in vitro 
is available, with the interesting finding of a small RNA 
molecule that is required in the packaging event. 
Although much is known regarding the major mechanisms 
used in phage development, some minor aspects remain 
to be clarified: for example, the role of several ORFs in 
phages φ29 and B103, as well as that of the additional 
information present in phage GA-1 that is lacking in 
phages φ29 and B103. 

The lytic Bacillus subtilis bacteriophage SPP1 was 
isolated in Pavia (Subtilis Phage Pavia) and first described 
by Riva et al. (99). Interest in this phage derived at 
first from the observation that the complementary DNA 
strands of the phage were separable into a heavy strand 
(ρ = 1.725 g/cm³) and a light strand (ρ = 1.713 g/cm³) 
following denaturation and isopycnic centrifugation (99, 109). 
This observation and the finding that the phage was 
highly active in transfection led to the use of SPP1 in 
studies of mismatch repair (109). In later experiments the 
molecular biology of the phage, including its structure, 
total DNA sequence, genetics, and life cycle, was established. 
Furthermore, SPP1 transfection was a convenient 
tool to study the mechanism of DNA processing during 
uptake into competent cells (123). SPP1 is a generalized 
transducing phage with respect to bacterial host markers 
and it also transduces plasmid DNA. These studies made 
SPP1 one of the best characterized bacteriophages of 
Gram-positive bacteria. In this review we shall summarize 
the results obtained in our and other laboratories during 
the last 35 years. Special attention is given to comparisons 
of the results obtained with SPP1 with properties of other 
organisms, providing a contribution to the understanding 
of the evolution of bacterial viruses. It should be realized 
that all findings reported here were obtained under 
laboratory conditions, which would not necessarily reflect 
conditions that the phage would encounter in its natural 
habitat. 

General Properties 
Properties of the Virion 

The SPP1 virion has an isometric icosahedral capsid of 
approximately 60 nm diameter attached to a long noncontractile 
tail composed of a helical tube (177 nm long) and 
a fiber responsible for host cell attachment (99; figure 23-1). 
SPP1 belongs to the morphotype B1 of the Siphoviridae 
family (1, 18; chapter 2). SPP1 is the prototype of the proposed 
genus “SPP1-like viruses” that includes phages ρ15, 
SF6, and 41c. 

The standard host for SPP1 is B. subtilis strain 168 and 
its derivatives. On this strain and using tryptone-yeast 
(TY) plates (15, 100) SPP1 wild-type produces plaques 
with a clear center of about 2 mm and a halo with a width 
of about 1 mm during overnight incubation. Numerous 
mutants affecting the plaque type, temperature sensitivity, 
and growth on suppressor-sensitive strains have been identified. 
Phage stocks with titers of 10¹⁰ to 10¹¹ are readily 
obtained by infection of host bacteria growing in liquid 
TY cultures or in synthetic medium. Phage stocks can be 
concentrated further by precipitation with polyethylene 
glycol. Virions are very stable provided they are maintained 
in the presence of 10⁻² M Mg²⁺, Ca²⁺, or Mn²⁺ ions. 
Chelating agents such as citrate, pyrophosphate, EGTA, 
or EDTA rapidly destroy infectivity. Adsorption-resistant 
mutants of B. subtilis 168 have been isolated and the associated 
gene was mapped in the host genome. The nature of 
the bacterial receptor for SPP1 adsorption has not been identified. 
Phage SPP1 cannot stably establish itself in infected 
cells as a prophage. Phage stocks, however, contain a fraction 
of transducing particles with only bacterial DNA, 
which will deliver such DNA to the recipient host bacterium 
without lysing the cell. SPP1 is sensitive to restriction 
by some of the restriction/modification systems identified 
in B. subtilis (122). Infections of SPP1 and other B. subtilis 
phages do not produce mixed bursts. However, SPP1 forms 
viable hybrid virions in mixed infections with its highly 
related phages ρ15, SF6, and 41c (101, 104). Multiple infection 
with several SPP1 phages per cell affects neither the 
burst size nor the latent period. Such infections permit 
the performance of genetic crosses. Crosses using a large 
diversity of mutant phages have led to the establishment 
of a linear genetic map of SPP1 with distances between 
markers given in probabilities of recombination between 
them (110). 

Figure 23-1 Structure of bacteriophage SPP1. A: Visualization of an SPP1 virion by electron microscopy after negative 
staining with uranyl acetate. The bar represents 50 nm. B: A scheme of the mature phage that compiles our current 
knowledge of the particle structural organization. The icosahedral capsid is formed by the main capsid protein G13P and 
a decoration protein G12P (12). The connector between capsid and phage tail is composed of the portal protein G6P and 
the head completion proteins G15P and G16P (12, 76). The location of the minor capsid protein G7P in the mature phage 
head has not been identified but the protein is known to bind G6P at early stages of SPP1 morphogenesis (45, 115). 
The two main structural proteins of the tail are G17.1P and G17.1*P, which have an identical N-terminus, but G17.1*P has 
a higher molecular weight (43). The structure of the tail fiber is drawn based on visual inspection of electron micrographs. 
Its individual protein components have not been defined. 

DNA 

SPP1 DNA can be readily obtained and purified by conventional 
methods. SPP1 wild type DNA has a contour 
length of about 15 µm. The molecules are partially circularly 
permuted and have a terminal redundancy of 4% 
(89, 119). In most of our previous communications, as well 
as in figures 23-2 and 23-3, SPP1 DNA is represented 
as the ensemble of ordered EcoRI restriction fragments 
(13, 97, 104). The DNA of EcoRI fragment 13 and of the major 
part of EcoRI fragment 1 is dispensable, as shown by the 
identification of deletions in these regions. A derivative of SPP1 
with an inversion of some 8.4 kb, extending from within 
EcoRI restriction fragment 1 to restriction fragment 5, was 
generated during a cloning experiment (41). SPP1 DNA 
does not contain any unusual bases. Properties of the 
DNA determined by physicochemical or chemical methods 
were refined and confirmed following the establishment of 
the total base sequence of SPP1 DNA (3). According to 
this sequence the SPP1 genome has a size of 44,007 bp 
and a base composition of 43.7% dG + dC. The compiled 
sequence of SPP1 DNA is available under EMBL accession 
number X97918. 

Figure 23-2 Physical and genetic map of the genome of bacteriophage SPP1. The genome is presented in a circular form as 
found in the cell at early times after infection. The inner circle represents scale from 1 to 44,007 base pairs (3). Digestion of 
the DNA with EcoRI generates 16 fragments as indicated by lines and numbers in the thick bar circle. The packaging initiation 
sequence pacLCR (see section “Morphogenesis”) is shown as a black box indicated by a vertical arrow. The direction of 
packaging is shown by a dotted arrow. The white rhombi within the EcoRI-4 and EcoRI-3 fragments denote the oriL and oriR 
sequences, respectively (see section “DNA Replication”). Segments of the genome dispensable for SPP1 multiplication are 
shown in white frames. The region coding for genes expressed early during infection is shown in dark grey and the region 
coding for late genes in light grey. The five early promoters (PE1-PE5) are indicated by filled circles in the outer periphery of 
the map. Arrows show the direction of transcription by σA-RNAP and arrowheads the position of rho-independent 
transcription terminators. The late σA-RNAP promoters (PL1-PL5; open circles) and their transcripts are represented using 
the same symbols. The operons coding for genes involved in DNA replication, DNA packaging, and virion morphogenesis, 
as well as the regions of the genome not characterized yet, are indicated on the outside. See 
thebacteriophages.org/frames_0230.htm for color version of this figure. 
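
The physical figures quoted for SPP1 DNA (genome size, base composition, terminal redundancy, contour length) can be cross-checked with a few lines of arithmetic. This is a sketch under two assumptions that are not stated in the text: a rise of 0.34 nm per base pair (B-form DNA), and a terminal redundancy expressed as a fraction of the genome length.

```python
# Values quoted in the text
GENOME_BP = 44_007          # unit genome length from the sequence (3)
GC_FRACTION = 0.437         # dG + dC content
TERMINAL_REDUNDANCY = 0.04  # terminal redundancy of mature molecules

# Assumption for illustration (not from the text)
BP_RISE_NM = 0.34           # rise per base pair of B-form DNA


def contour_length_um(n_bp: int, rise_nm: float = BP_RISE_NM) -> float:
    """Contour length in micrometres of a linear duplex of n_bp base pairs."""
    return n_bp * rise_nm / 1000.0


# Redundant sequence carried at the end of a mature molecule,
# assuming the 4% refers to the unit genome length.
redundant_bp = round(GENOME_BP * TERMINAL_REDUNDANCY)
mature_bp = GENOME_BP + redundant_bp
gc_bp = round(GENOME_BP * GC_FRACTION)

if __name__ == "__main__":
    print(f"unit genome contour length: {contour_length_um(GENOME_BP):.2f} um")
    print(f"mature molecule: {mature_bp} bp, "
          f"{contour_length_um(mature_bp):.2f} um")
    print(f"G+C base pairs: {gc_bp}")
```

Both the unit genome and the slightly longer mature molecule come out close to the contour length of about 15 µm given above.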


Proteins 

The proteins encoded by SPP1 were first analyzed by 
SDS polyacrylamide gel electrophoresis following disruption 
of radioactively labeled phage particles or after lysis 
of SPP1 infected B. subtilis cells (51). The proteins identified 
could be subdivided into three groups on the basis of 
their synthesis during the latent period. Such analyses were 
complicated by the fact that SPP1 infection does not interfere 
with host protein synthesis. Also, neither the irradiation 
of cells prior to infection nor the addition of drugs 
that were anticipated to selectively inhibit host protein 
synthesis provided conditions for the study of phage-coded 
protein synthesis (49). To eliminate the background of 
host protein synthesis, further characterization of SPP1 
proteins included the expression of SPP1 genes in infected 
minicells of B. subtilis (83) or in Escherichia coli infected 
with λ/SPP1 hybrid phages which, together, contained all 
clonable EcoRI restriction fragments of SPP1 DNA (4, 5). 
The set of proteins potentially encoded by SPP1 is shown 
in figure 23-3 (3). These include open reading frames (ORFs) 
coding for functionally identified proteins, for proteins 
which are dispensable, and for unidentified proteins that 
remain to be characterized. 

Life Cycle 

Adsorption and Infection 

SPP1 infection is initiated by adsorption of the phage 
to B. subtilis 168. As is the case for numerous other 
bacteriophages, two different steps can be distinguished in 
SPP1 adsorption. The initial interaction of SPP1 with the 
host cell surface is reversible and infective virions can be 
detached from the host. Lipoteichoic acids, which are 
components of the Gram-positive cell wall, play a role in 
this step (101, M. A. Santos, personal communication). 
The second step is the irreversible binding of SPP1 to the 
host bacterium followed by DNA ejection. Host mutations 
preventing irreversible adsorption were mapped by cotransduction 
experiments to the locus pha-2, which is linked 
to ald-1 in the B. subtilis chromosome (position 3277 kb; 71) 
(102). The SPP1 receptor was recently shown to be the 
transmembrane protein YueB (see note added in proof on 
p. 345). Mutations in pha-2 confer resistance uniquely to 
infection by SPP1 and to its close relatives ρ15, SF6, and 
41c. By contrast, resistance to other B. subtilis phages is 
associated with defects in the glucosylation of teichoic acids 
(gta mutations: 135). SPP1 and its relatives are some of the 
few phages known to infect gta strains (67). The SPP1-like 
phages thus appear to have developed a group-specific 
mechanism for delivery of DNA to the B. subtilis cytoplasm. 

Figure 23-3 The bacteriophage SPP1 mature chromosome. The two main bars denote the mature phage sequence (1 to 
~45.9 kb). The upper border of the bar represents the H (heavy) strand and the lower line the L (light) strand. Also indicated 
are the 16 EcoRI DNA fragments (numbered in order of decreasing size), the dispensable, early and late gene regions, the 
early (PE) and late (PL) promoters, and the origins of replication oriL and oriR (symbols are as in figure 23-2). Fragment EcoRI-13t has one 
end generated by EcoRI digestion and the other by cleavage at the pac site by the terminase enzyme. The pac region is 
located within EcoRI DNA fragment 1 (uncleaved: pacLCR, filled black box indicated by a vertical arrow). The cleaved pac 
site (pacCR, filled black oval, indicated by a vertical arrow) defines the first nucleotide of the SPP1 sequence. To present the 
full EcoRI DNA fragment 1, the first 718 base pairs (defining the EcoRI-13t fragment) are shown again in the upper main bar 
after position 44007 (broken line). Therefore, gene 7 is also presented twice. Due to imprecision of the headful cut 
mechanism, the end of the mature SPP1 chromosome (~45.9 kb) is not precisely defined. The numbered hatched boxes 
denote phage-encoded products that have been genetically or biochemically characterized, whereas numbered but 
otherwise empty boxes indicate those of unknown function (3). See thebacteriophages.org/frames_0230.htm for color 
version of this figure. 

At low multiplicity of infection, >90% of the phage 
DNA is transferred to the host cell interior within 10 minutes 
of phage-bacteria incubation (136). The phage chromosome 
becomes accessible to DNase I attack during its 
transfer to the host cell, demonstrating that it is exposed 
to the exterior medium at a stage of the transfer process 
(103, 136). 

Delivery of the SPP1 chromosome to the B. subtilis 
cytoplasm leads to the successful recruitment of the host 
cell machineries needed to express the viral genes at a 
high level, to replicate the SPP1 DNA, and to support 
assembly of virions at high efficiency. This occurs in the 
absence of host gene-expression shutoff (49). 

Transcription 

Transcription of SPP1 DNA is asymmetric and therefore 
unidirectional (figure 23-2; 50, 88, 98). Of the two 
separable strands of SPP1 DNA only the heavy strand, with 
58.3% purine, is transcribed. This is, however, not an inherent 
requirement for the viability of SPP1: in the SPP1 
mutant described before in which the entire region between 
EcoRI restriction fragments 1 and 5 has been inverted, 
transcription of early genes must occur from the DNA light 
strand (41). 
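
The purine bias of the heavy strand fixes the composition of the light strand, because every purine on one strand pairs with a pyrimidine on the other. The sketch below illustrates this with a toy sequence that is purely hypothetical and is not taken from SPP1.

```python
# Translation table for base complementation of an uppercase DNA string.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def purine_fraction(seq: str) -> float:
    """Fraction of A and G residues in an uppercase DNA sequence."""
    return sum(base in "AG" for base in seq) / len(seq)


def reverse_complement(seq: str) -> str:
    """Sequence of the complementary strand, written 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def partner_strand_purine_fraction(fraction: float) -> float:
    """Purine fraction of the complementary strand, given one strand's."""
    return 1.0 - fraction


if __name__ == "__main__":
    # Value quoted in the text for the SPP1 heavy strand.
    heavy = 0.583
    print(f"light strand purine fraction: "
          f"{partner_strand_purine_fraction(heavy):.3f}")

    # Hypothetical toy sequence, for illustration only.
    toy = "GGAGAAGTCA"
    print(purine_fraction(toy), purine_fraction(reverse_complement(toy)))
```

A heavy strand with 58.3% purine therefore implies a light strand with 41.7% purine, the kind of compositional asymmetry that accompanies the separability of the two strands described earlier.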

In vivo and in vitro transcription studies located five 
early (PE1 to PE5) and five late (PL1 to PL5) promoters in the 
SPP1 genome (32, 33, 86, 95, 116, 117; figures 23-2, 23-3). 
The messenger RNAs are polycistronic. σA-RNA polymerase 
(σA-RNAP) is utilized for the transcription of early SPP1 
promoters (84). Similarly to strongly transcribed housekeeping 
genes, the five early promoters (PE1-PE5) have 
upstream of their consensus sequences an alternating 
poly-purine poly-pyrimidine (Pur.Pyr) region and a very 
high dA + dT content (AT-rich) (3, 63). 

PE1 maps upstream of orf45.1 (defining a nonessential 
operon), PE2 upstream of orf37.3 (replication initiation 
operon), PE3 upstream of orf34 (replication accessory proteins 
operon), and PE4 and PE5 upstream of orf33 and of 
orf27, respectively (uncharacterized operons) (33, 95, 117; 
figures 23-2, 23-3). Transcription originating from PE1 and 
PE4 yields products that are dispensable for phage growth 
under laboratory conditions. Furthermore, some of the 
products read from the PE2 and PE5 promoters are also 
dispensable. Transcription from the PE1 promoter yields 
a messenger RNA that allows expression of orf46 to orf53, 
but SPP1 mutants with a deletion in this segment do not 
appear to be in any way defective, provided transcription 
read-through into the late operon does not occur (see 
33). Deletion of the central part of orf33, that is, the only 
product transcribed from PE4, does not affect the phage 
physiology (41; figures 23-2, 23-3). 

The early promoters show a hierarchy of signal strength: 
when the promoter strength of PE2 is considered as 1, PE3 
is 1.5 to 1.9 times stronger, whereas the strength of PE1, PE4, 
and PE5 is 0.8-, 0.6-, and 0.5-fold, respectively, that of PE2 
(33, 116, 117). The presence of additional weak promoters 
located within the genes cannot be excluded. 
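
The hierarchy described above can be written down as a small table and ranked. This is only a convenience sketch: the relative strengths are those quoted in the text (normalized to PE2 = 1), and representing each value as a (low, high) range ranked by its midpoint is an illustrative choice, not something taken from the cited studies.

```python
# Relative strengths of the SPP1 early promoters, normalized to PE2 = 1.
# Each entry is a (low, high) range; single values use low == high.
RELATIVE_STRENGTH = {
    "PE1": (0.8, 0.8),
    "PE2": (1.0, 1.0),
    "PE3": (1.5, 1.9),
    "PE4": (0.6, 0.6),
    "PE5": (0.5, 0.5),
}


def midpoint(bounds: tuple[float, float]) -> float:
    """Midpoint of a (low, high) range."""
    low, high = bounds
    return (low + high) / 2.0


def rank_promoters(strengths: dict[str, tuple[float, float]]) -> list[str]:
    """Promoter names ordered from strongest to weakest by range midpoint."""
    return sorted(strengths, key=lambda name: midpoint(strengths[name]),
                  reverse=True)


if __name__ == "__main__":
    for name in rank_promoters(RELATIVE_STRENGTH):
        low, high = RELATIVE_STRENGTH[name]
        label = f"{low}" if low == high else f"{low} to {high}"
        print(f"{name}: {label} x PE2")
```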

The late genes, whose expression requires preceding 
phage-DNA synthesis (49), are defined by the five late operons 
[gene 1-7 operon (read from PL1 and PL2), gene 8-9 
operon (PL3), gene 11-17.3 operon (PL4) and gene 17.5-26 
(PL5)] (31-33). SPP1 late transcription, in addition to a 
replicating DNA substrate, requires an uncharacterized 
phage-encoded product (31). Unlike the case of phage λ, in 
which late transcription is activated by allowing the host 
RNAP to proceed through termination sites, the SPP1 
transcriptional activator promotes late transcription by a 
different but uncharacterized mechanism (see 3). The 
best-characterized SPP1 late promoters are PL1 and PL2 (31-33). 
The PL1 promoter, which accounts for about 95% of the 
transcripts of the gene 1-7 operon, lacks the -35 σA-RNAP 
consensus region, whereas the weak PL2 promoter, whose 
initiation site maps 35 nucleotides downstream of the 
PL1 start site, has all the features present in a vegetative 
B. subtilis promoter (32). Furthermore, transcription within 
the gene 1 operon, which is read from the PL1 and PL2 
promoters, is shut off by the terminase enzyme at late times 
(18 minutes) after infection (31). 

DNA Replication 

Replication of the SPP1 DNA is mediated at least partly 
by the host DNA polymerase III. This follows from the 
observations that SPP1 DNA synthesis is sensitive to 
the drug HPUra (49) and that mutations are induced in 
SPP1 when the phage grows on host cells with a mutator 
polymerase III (87). 

SPP1 packages its double-stranded (ds) DNA into an 
empty procapsid by a processive headful mechanism, using 
a linear head-to-tail concatemer as a substrate (30, 32, 59, 
89). SPP1 circular molecules were detected in infected 
cells, but branched replication intermediates have not 
been observed (24, 56, 80, 81). Furthermore, a block in 
DNA replication does not reveal the accumulation of 
head-to-tail concatemers (24, 81). Initiation of ring-to-ring 
replication with the subsequent shift via 
recombination-dependent DNA replication (RDR) leads to the generation 
of concatemers that are the substrate for DNA packaging 
(see 107, 131). 
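
Processive headful packaging of a concatemer explains why mature SPP1 molecules are terminally redundant and partially circularly permuted. The sketch below assumes an idealized, fixed headful of 45,900 bp (the approximate mature chromosome length given in the legend of figure 23-3); in reality the headful cut is imprecise, so the numbers are illustrative only.

```python
GENOME_BP = 44_007   # unit genome length from the sequence (3)
HEADFUL_BP = 45_900  # approximate headful; the real cut is imprecise


def headful_starts(n_rounds: int,
                   genome_bp: int = GENOME_BP,
                   headful_bp: int = HEADFUL_BP) -> list[int]:
    """Genome coordinate (0 = pac cut) at which each headful begins.

    Packaging starts at pac and proceeds along the concatemer, each
    round encapsidating one headful and starting where the last ended.
    """
    return [(i * headful_bp) % genome_bp for i in range(n_rounds)]


def terminal_redundancy_bp(genome_bp: int = GENOME_BP,
                           headful_bp: int = HEADFUL_BP) -> int:
    """Sequence repeated at both ends of each packaged molecule."""
    return headful_bp - genome_bp


if __name__ == "__main__":
    print("redundancy per molecule:", terminal_redundancy_bp(), "bp")
    for i, start in enumerate(headful_starts(4), start=1):
        print(f"headful {i} starts at genome position {start}")
```

Because each headful is slightly longer than one genome, every molecule carries a short repeat at its ends, and the start point of each successive molecule in a packaging series shifts further along the genome.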

Analysis of SPP1 conditional-lethal mutants for their 
capacity to synthesize phage DNA has led to the identification 
of five different complementation groups. Mutants 
in genes 38, 39, and 40 show a block in DNA replication, 
whereas mutants in genes 34.1 and 35 show a normal 
initiation but a replication arrest phenotype (24, 95, 132). 
Accumulation of ring-to-ring (theta replication) SPP1 
replication intermediates has not been observed in 
SPP1-infected cells (56, 80, 81). 

Initiation of SPP1 Theta Replication 

SPP1 ring-to-ring replication begins at a “unique” origin 
and proceeds unidirectionally (56, 81). It has previously 
been shown that G38P,* G39P, and G40P are the only 
SPP1-encoded functions necessary and sufficient to drive theta 
replication from the cis-acting oriL region in an otherwise 
nonreplicative element in B. subtilis cells (85). An SPP1 
sequence termed oriR has also been located 12 kb away 
from oriL in the SPP1 circular map (24, 56, 81, 85, 95; 
figures 23-2, 23-3). In vitro studies have revealed that multiple 
copies of G38P bound to its cognate site induce local 
unwinding of the adjacent AT-rich sequence present within 
oriL or oriR (85; figure 23-4). 
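
On a circular map, two sites that lie 12 kb apart along one arc are also separated by the complementary arc. The short sketch below works this out for a genome rounded to 44 kb; the rounding is an assumption made for readability. The longer arc matches the 32 kb replication intermediates discussed under sigma replication.

```python
GENOME_KB = 44.0          # SPP1 genome (44,007 bp), rounded for readability
ORI_SEPARATION_KB = 12.0  # distance between oriL and oriR on the map


def arcs_between(genome_kb: float, separation_kb: float) -> tuple[float, float]:
    """Lengths of the two arcs joining two sites on a circular genome."""
    if not 0 <= separation_kb <= genome_kb:
        raise ValueError("separation must lie between 0 and the genome size")
    return separation_kb, genome_kb - separation_kb


if __name__ == "__main__":
    short_arc, long_arc = arcs_between(GENOME_KB, ORI_SEPARATION_KB)
    print(f"short arc: {short_arc} kb, long arc: {long_arc} kb")
```

A fork that leaves oriL unidirectionally therefore reaches oriR after copying either about 12 kb or about 32 kb, depending on its direction of travel.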

Initiation of SPP1 DNA replication strictly requires 
G38P, G39P, and G40P as well as the host DNA primase 
(DnaG), DNA Pol III, and topoisomerases of types I and II. 
In the absence of ATP, multiple copies of monomeric 
G38P, bound to its cognate sites on a supercoiled substrate, 
induce local unwinding of the adjacent AT-rich 
sequence, leading to open complex formation (6, 7, 85, 95). 
The single-stranded binding protein (SSB-G36P) binds to 
the ssDNA at the open complex (figure 23-4). G39P is 
a naturally partially unfolded protein that folds upon 
interaction with G40P-ATP and shuts off all activities 
associated with the latter (e.g., ATPase, helicase, ssDNA 
binding) (7, 9). G40P-ATP DNA helicase, which is a hexamer 
in solution, interacts with G39P (7, 10). Both proteins form 
a heterododecamer (G39P₆-G40P₆-ATP) in solution. 
G39P of the G39P₆-G40P₆-ATP complex interacts with 
G38P bound to oriL, loads G40P₆-ATP onto the ssDNA at 
the open complex, and forms a heterodimer with G38P 
(figure 23-4). Then, the G38P-G39P complex is released 
from the replication complex (7). G40P-ATP released from 
the G39P₆-G40P₆-ATP complex is “activated”. 
G40P₆-ATP is stabilized on the ssDNA upon interaction with 
SSB-G36P bound to the complementary strand and interacts 
with the host-encoded monomeric DnaG (6) and with the 
τ subunit of DNA Pol III (79), and SPP1 theta type DNA 
replication initiates (see figure 23-4). 

* The designations GXP (gene X product) and gpX (gene 
product X) are used for the same SPP1 protein. We follow 
the first terminology in this review. 

Figure 23-4 Model for phage SPP1 initiation of DNA replication. A: Model for SPP1 initiation of theta type DNA replication. 
First, G38P recognizes AB boxes of the SPP1 replication origin (oriL), and binds to them in an ATP-independent fashion. 
Binding of G38P leads to melting of the AT-rich region adjacent to the AB boxes. The helix destabilizer, G36P, binds to the 
single-stranded DNA. The G39P-G40P-ATP complex is loaded in the unwound region by the interactions between G39P and 
G38P and the ATP-dependent single-strand DNA binding capacity of G40P. The interaction of G39P with G38P, which form a 
G38P-G39P heterodimer, dissociates G39P from the G39P-G40P-ATP complex and the helicase is free of the inhibitory effect 
of its loader with subsequent release of the G39P monomers and homodimers and G38P-G39P heterodimers from the 
nucleoprotein complex. Both DnaG and DNA Pol III upon interacting with G40P are then loaded at the origin. The hydrolysis 
of ATP would then produce the unwinding of the DNA by G40P and its translocation. B: Roadblock as a model for the shift 
from theta to sigma replication. G38P bound to oriL or oriR blocks replication fork progression. A nick in the leading strand 
(bottom strand) will be processed by the putative 5'-3' exonuclease, G34.1P, to generate a 3' single-strand DNA tail on which 
G35P will polymerize. A nick in the lagging strand (top strand) will not require further processing, and G35P will polymerize 
on it. C: Model for SPP1 initiation of sigma type DNA replication. Binding of G38P, with the help of G36P, stabilizes the 
melted AT-rich region adjacent to the AB boxes at either oriL or oriR. A G35P-single-strand DNA filament pairs with the 
complementary strand of the unwound region to form a D-loop. The G39P-G40P-ATP complex is then loaded in the 
unwound region by the interactions between G39P and G38P and the ATP-dependent single-stranded DNA binding capacity 
of G40P. The interaction of G39P with G38P, leading to G39P monomers and homodimers and G38P-G39P heterodimers, 
dissociates G39P from the G39P-G40P-ATP complex. G40P helps the assembly of both DnaG and DNA polymerase III at the 
AT-rich region. The 3'-OH end of the paired strand could be used to prime the leading strand and DnaG could provide 
the primer for lagging-strand synthesis. See thebacteriophages.org/frames_0230.htm for color version of this figure. 

Initiation of SPP1 Sigma Replication 

The molecular events that trigger the synthesis of 
head-to-tail concatemeric SPP1 DNA are not well understood. 
The picture of how RDR proceeds has changed over the 
last decade (35, 72, 78, 131). SPP1 replication fork re-start, 
which leads to the accumulation of SPP1 concatemers, is 
independent of host-encoded RecA, AddAB, and RecF 
(counterparts of E. coli RecA, RecBCD, and RecF, respectively) 
recombination proteins, of the replicative DNA helicase and 
its loader (DnaC and DnaI, respectively), and of components 
of the B. subtilis primosome (e.g., PriA, DnaB, DnaD) 
(2, 8, 80, 95, 132). 

Genetic data suggest that the generation of linear concatemeric
SPP1 DNA requires the phage-encoded G35P
and G34.1P proteins. Sedimentation studies of DNA synthesized
at the restrictive temperature in cells infected with
SPP1ts220E (impaired in G34.1P) or SPP1tsI27 (impaired in
gene 35) revealed that only a small percentage of the
phage DNA can be recovered in a fast-sedimenting form
(concatemeric DNA). In both cases less-than-unit-length
SPP1 genomes (30-35 kb in size) accumulated (24). Considering
the unidirectional movement of the SPP1 replication
fork, we can assume that any event initiated at oriL
that stops at oriR will generate 32 kb replication intermediates.
G35P binds to and forms filaments on ssDNA; catalyzes, in
an ATP-independent manner, joint molecule formation
between a 3'-ssDNA and a homologous AT-rich region of
oriL on a supercoiled molecule; and specifically interacts
with the G40P DNA helicase and G36P SSB proteins (8).
G34.1P shares 18% identity with the ATP-independent
5'→3' exonuclease RecE, the Rac-encoded product (also
termed ExoVIII). After Skalka (107), Formosa and Alberts
(54), Viret et al. (131), and Kuzminov (72), we hypothesize
that after the initial phase of theta replication
from oriL, the progression of the replication fork might
be stalled when it encounters G38P bound at the inversely
oriented oriR (roadblock) in the absence of overt DNA
damage (85, 95; figure 23-4B), or at any region in the
presence of DNA damage. The former claim is consistent
with the observation that progression of a replicating
fork in vivo is transiently stalled when the replisome
approaches an inversely oriented silent origin on a plasmid
(see 129). The stalled replication fork breaks and the
broken fork is rescued by a process dependent on the
phage-encoded G34.1P and G35P functions, since a defect in
these functions cannot be overcome by any of the host
recombination and/or PriA-dependent replication functions
(2, 8, 80, 95, 132). The G34.1P exonuclease may generate a
duplex DNA with a 3' terminus (Martínez-Jiménez, personal
communication; figure 23-4B). G35P-mediated joint molecule
formation could provide a 3'-end to anneal at oriR on
a second supercoiled SPP1 molecule (figure 23-4B; 8). A
left-handed G35P filament interacts with G40P-ATP and
G36P. G40P-ATP bound at the ssDNA region free of G36P
directs the assembly of DnaG and DNA Pol III (6, 79). The
3'-OH end of the annealed strand might act as a primer
for initiating concatemeric DNA synthesis (sigma replication)
on a supercoiled template. Furthermore, G38P bound
to oriR might interact with the G39P-G40P-ATP complex
to enhance the loading of the DNA helicase at the D-loop
region. In the case of phage λ sigma replication, it was
suggested that the λ O protein bound to the λ ori might
create a physical barrier that permits only one round
of unidirectional theta replication both in vivo and in vitro
(11, 42). The broken fork could be rescued either by the
concerted action of the Red products (the λ α and β proteins,
“functional counterparts” of SPP1 G35P and G34.1P) or by
the host recombination and repair machinery (see 8, 107,
112). This is consistent with the fact that the Red genes,
though involved in λ replication, are not essential (107, 112).
Since the λ and SPP1 sigma replication models are alike, we
suggested that G34.1P-G35P (in the case of phage SPP1)
and the Red and/or the host recombination system (in
the case of phage λ; 107) promote the formation of new
replication forks, rescuing the broken replication forks
by the invasion of duplex DNA on a supercoiled homolog
by an ssDNA 3' terminus. The invading 3'-OH end then might
be used to prime sigma replication.

Morphogenesis 

Bacteriophage SPP1 devotes more than 40% of its
genetic information to the synthesis and assembly of structural
components (figures 23-2, 23-3). The phage capsid
and the tail are formed in independent assembly pathways.
The sequence of reactions that yields the DNA-filled
capsid was studied in enough detail to recommend SPP1
as one of the model systems for viral capsid assembly
(figure 23-5). In contrast, the tail assembly remains to be
characterized in detail. The DNA-filled capsid is composed
of six viral proteins (G6P, G7P, G12P, G13P, G15P, and G16P)
and three non-structural SPP1 proteins participate in its
assembly (G1P, G2P, and G11P). The tail has at least eight
different proteins as identified in SDS-PAGE patterns
(51; Dröge and Stiege, personal communication). The role
of host proteins in assembly has not been studied.

Figure 23-5 Morphogenetic pathway of bacteriophage SPP1. A: The first stable intermediate in phage assembly is
the procapsid. It contains four proteins: the major head protein, G13P (gray shell); the scaffolding protein, G11P (small circles that fill
the inner space of the structure); and the portal protein, G6P, associated with the minor head protein, G7P, which are
localized at the portal vertex of the procapsid (white structure). The terminase complex (G1P-G2P) is represented by gray
ovals and a white square. After interaction of the terminase-DNA complex with the portal vertex the DNA (thin line) is
pumped into the head, the scaffolding protein is released, and capsid expansion occurs. Termination of DNA packaging
requires headful cleavage of the DNA concatemer and release of the terminase-DNA concatemer from the portal vertex. The
DNA-filled capsid is stabilized by binding of G15P and G16P to the portal vertex. Attachment to the portal vertex of the
phage tail that is assembled in an independent assembly pathway completes SPP1 assembly. B: The micrograph shows stable
assembly intermediates of the SPP1 morphogenetic pathway and its final product, the virion. Abbreviations: pc,
procapsid; fc, DNA-filled capsid; ip, infective phage. The bar represents 100 nm. Negative staining with uranyl acetate.

See thebacteriophages.org/frames_0230.htm for color version of this figure.

Procapsid Assembly 

The first detectable intermediates of capsid assembly are
spherically shaped procapsids with an outer diameter of
approximately 55 nm. These are composed of the portal
protein G6P (57.3 kDa), the minor capsid protein G7P
(35.1 kDa), the scaffolding protein G11P (23.5 kDa), and the
major capsid protein G13P (35.4 kDa) (12, 45; figure 23-5).
The procapsid icosahedral lattice is formed by the major
capsid protein G13P, displaying most probably T = 7
symmetry (45). The interior of the procapsid is “filled” with
100-180 copies of the scaffolding protein G11P. Co-production
of G11P and G13P yields procapsid-like structures
both in B. subtilis infected with SPP1 DNA-packaging
mutants and when genes 11 and 13 are coexpressed in a
heterologous host (12, 45). However, no interaction is
observed between the two proteins when they are produced
independently and then mixed in vitro (12, 45). G13P alone
polymerizes into curvilinear structures, most frequently
with a spiral-like shape. The presence of G11P directs the
polymerization of G13P to the correct geometry required for
formation of closed protein lattices (12). The organization
of G11P inside the procapsid is not known, but it was
shown to adopt a polymorphic oligomeric state in solution.
This organization is unique for the SPP1 scaffolding protein
when compared with functional analogs from other
species (12).
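
The T = 7 lattice mentioned above can be made concrete with the standard Caspar-Klug bookkeeping for icosahedral shells. The sketch below is only an illustration of that general geometry, not a description of how SPP1 was analyzed: the function names are hypothetical, and the option of replacing one pentamer by a portal is an assumption based on the text's description of a single specialized portal vertex.

```python
def triangulation_number(h: int, k: int) -> int:
    """Return the Caspar-Klug triangulation number T = h^2 + h*k + k^2."""
    if h < 0 or k < 0 or (h == 0 and k == 0):
        raise ValueError("h and k must be non-negative and not both zero")
    return h * h + h * k + k * k


def lattice_counts(t: int, portal_vertices: int = 0) -> dict:
    """Count capsomers and subunits in an ideal icosahedral lattice.

    portal_vertices is the number of the 12 fivefold vertices assumed
    (for illustration) to hold a portal instead of a pentamer.
    """
    if t < 1 or not 0 <= portal_vertices <= 12:
        raise ValueError("t must be >= 1 and portal_vertices in 0..12")
    pentamers = 12 - portal_vertices
    hexamers = 10 * (t - 1)
    return {
        "T": t,
        "pentamers": pentamers,
        "hexamers": hexamers,
        "subunits": 5 * pentamers + 6 * hexamers,
    }


if __name__ == "__main__":
    # (h, k) pairs giving T = 3, 4, and 7.
    for h, k in [(1, 1), (2, 0), (2, 1)]:
        t = triangulation_number(h, k)
        print(lattice_counts(t, portal_vertices=1))
```

Under these assumptions an ideal T = 7 shell with one portal vertex has 60 hexamers and 11 pentamers of the major capsid protein; smaller T numbers give correspondingly smaller shells.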

Figure 23-6 Three-dimensional structures of the bacteriophage SPP1 portal protein G6P and of the connector. The structures
are based on data from electron micrographs of G6P oligomers and connectors imaged under the electron microscope and
processed by single particle analysis (94). A: Different views of the G6P cyclical 13-mer and a cut-open view along the
symmetry axis of the molecule (bottom right). B: Equivalent views of the connector composed of stacked rings of G6P,
G15P, and G16P. Each ring is composed of 12 subunits. The wide region of the connector formed by G6P is oriented toward
the interior of the viral capsid while the bottom is exposed to the phage tail. The connector internal channel that can
accommodate a DNA molecule is closed at the level of G16P (bottom ring), preventing exit of DNA from the phage capsid
(Courtesy of E. V. Orlova). See thebacteriophages.org/frames_0230.htm for color version of this figure.

The procapsid-like structures composed of G11P and
G13P are a mix of particles with the normal procapsid
size (T = 7) and others with a smaller dimension (T = 3 or
4). None of these are competent for viral DNA encapsidation
(45). The third component required for biological
activity of the SPP1 procapsid is the portal protein G6P
that defines a specialized vertex for chromosome packaging
(45, 76). Portal proteins are cyclical oligomers with a
turbine-like shape. Such structures are ubiquitous among
tailed bacteriophages and herpesviruses (figure 23-6; 91,
93, 106, 126). Presence of G6P leads to the formation of
procapsids with the normal size, showing that the portal
protein influences the copolymerization of G11P and G13P,
preventing assembly of small procapsids (45). This effect
implies that the portal protein must be integrated in the
procapsid structure at an early stage, since the decision
between formation of a T = 7 procapsid or of a structure
with a smaller size is made after the first round of hexamers
is assembled around the initial procapsid vertex (121).
G6P is most likely a component of the initiator complex of
SPP1 procapsid assembly, but it does not act as a nucleator
of the reaction because the rate of assembly in vivo
is independent of its presence (45). The organization
of the initiator complex is not known. A stable interaction
between G6P, G11P and/or G13P is detected only when the
three proteins are coproduced, suggesting that the efficient
incorporation of G6P into the procapsid requires a
structural context created by G11P and G13P (45). Single
amino acid substitutions in G6P that block its incorporation
in the procapsid structure cluster in two distinct segments
of the G6P primary structure (66). These mutations
identify regions in the portal protein that are required
either for G6P association into cyclical oligomers or for
interaction with G11P and/or with G13P during procapsid
assembly (66).

The portal protein G6P is present as a cyclical 12-mer
in the SPP1 virion (76). However, G6P alone is organized
as a cyclical oligomer with 13-fold symmetry in solution
and in three-dimensional crystals (46, 68, 69). Such symmetry
is an intrinsic property of the protein that is
maintained upon dissociation-reassociation (68) and upon
denaturation followed by refolding and reassociation
(Jekow and Orlova, personal communication). The architecture
of the G6P 13-mer (93; figure 23-6) is remarkably
similar to that of portal proteins of other species despite
the absence of any detectable relationship between their
amino acid sequences (126). Lurz et al. (76) argued that the
13-mer is an on-pathway precursor of the phage portal oligomer
and that G6P acquires a 12-mer organization in the
initiation complex of procapsid assembly. The G6P forms
competent for assembly of this complex would be G6P open
curvilinear oligomers (... 8-, 9-, 10-, 11-, 12-mers) found in
equilibrium with the closed 13-mer ring (127). The interaction
with the other procapsid proteins, G11P and G13P,
would impose an increased bend between G6P subunits,
locking them in a 12-mer ring (76). Under these conditions
procapsid assembly, an irreversible process, displaces the
equilibrium of G6P toward dissociation of the 13-mer ring,
supplying additional subunits and curvilinear oligomers for
procapsid assembly.
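
The “increased bend” needed to close a 12-mer rather than a 13-mer ring follows from simple ring geometry. The sketch below is a purely geometric illustration, assuming idealized regular rings in which every subunit turns by the same angle; the function names are hypothetical and nothing here models the actual protein contacts.

```python
def turn_per_subunit(n_subunits: int) -> float:
    """Angle in degrees turned between neighbours in a closed regular ring."""
    if n_subunits < 3:
        raise ValueError("a closed ring needs at least 3 subunits")
    return 360.0 / n_subunits


def arc_spanned(n_subunits: int, bend_deg: float) -> float:
    """Total angle covered by an open oligomer with a fixed bend per subunit."""
    return n_subunits * bend_deg


if __name__ == "__main__":
    bend13 = turn_per_subunit(13)
    bend12 = turn_per_subunit(12)
    print(f"bend per subunit, 13-mer ring: {bend13:.2f} degrees")
    print(f"bend per subunit, 12-mer ring: {bend12:.2f} degrees")
    print(f"extra bend needed per subunit: {bend12 - bend13:.2f} degrees")
    # An open 12-mer that keeps the 13-mer bend does not close on itself.
    print(f"arc of 12 subunits at 13-mer bend: {arc_spanned(12, bend13):.1f} degrees")
```

In this idealization, twelve subunits keeping the 13-mer bend cover less than a full turn, so ring closure as a 12-mer requires each subunit interface to bend slightly more, in line with the locking step proposed above.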

The major capsid protein, the scaffolding protein, and
the portal protein are the common requirements to assemble
procapsids competent for DNA packaging in all tailed
bacteriophages (phage HK97 and possibly the Sfi21-like
group of phages are exceptions; see 22, 62). These are
also the only essential components for assembly of SPP1
biologically active procapsids (45). The fourth component of
the SPP1 procapsid is G7P, present in two to three copies
per particle (12, 43, 115). In the absence of G7P, viable
phages are still assembled in an in vitro DNA packaging
assay. However, the procapsid biological activity is
reduced 5- to 10-fold (45). G7P forms stable complexes with
G6P in vivo and in vitro and is most likely localized in
the portal vertex of the procapsid (115). Formation of this
complex seems essential for G7P to accomplish its function,
but the precise role of G7P in the SPP1 life cycle remains to be
determined. The G6P-G7P interaction is disrupted during
assembly of the virion, probably when DNA is translocated
through the portal pore to the capsid interior (115). Proteins
with these properties were not characterized in other virus
systems. However, the finding of genes whose products share
similarity with G7P in several bacteriophage species of
Gram-positive bacteria (75, 115, 133 and references therein)
indicates that this protein has a widespread role in a variety
of phage systems.

Figure 23-7 The pac region of bacteriophage SPP1. The
black bar indicates 399 base pairs of the SPP1 sequence
containing the packaging initiation region. The enlarged
regions, pacL and pacR, are recognized by the terminase
small subunit (G1P) and are the nonencapsidated left end
and the encapsidated right end, respectively, of the SPP1
sequence. pacC is the packaging processing site. The open
boxes labeled a, b, and c indicate repeated sequences in
the pac region. The arrows indicate the staggered nicks in
the 10 bp repeat boxes within pacC introduced by the
terminase large subunit (G2P). See thebacteriophages.org/frames_0230.htm
for color version of this figure.

DNA Packaging 

SPP1 uses concatemers of its genome as the substrate for
packaging by a headful mechanism (see chapter 6). Encapsidation
is initiated by recognition and cleavage at the
nucleotide sequence pac followed by unidirectional translocation
of concatemer DNA to the interior of the preformed
procapsid (16, 20, 29, 32, 59). When a threshold amount of
DNA, representing about 104% of the SPP1 genome, has
been packaged (headful) a sequence-independent DNA
cleavage terminates the encapsidation cycle (89, 120). A
second cycle of encapsidation initiates at the DNA extremity
generated by this cut and sequential headful packaging
events proceed along the DNA concatemer in a processive
fashion (89, 119). The headful packaging mechanism
generates a heterogeneous population of terminally redundant
and partially circularly permuted DNA molecules, as
in the cases of P1 and P22, or of T4, whose DNA is totally
permuted and terminally redundant (16, 17, 55).
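
How sequential headfuls of about 104% of a genome produce terminally redundant, circularly permuted molecules can be followed with a little arithmetic. The sketch below is an idealized illustration only: the genome length of 44,000 bp is a round hypothetical value chosen for the example, the function and field names are invented here, and the headful cut is treated as exact although the text describes it as imprecise.

```python
from typing import List, NamedTuple


class Headful(NamedTuple):
    round_index: int       # 1 = the pac-initiated packaging event
    start_in_genome: int   # start position, in bp downstream of the pac cut
    length_bp: int         # DNA packaged in this headful
    redundancy_bp: int     # sequence repeated at both ends of the molecule


def successive_headfuls(genome_bp: int, headful_percent: int = 104,
                        n_rounds: int = 5) -> List[Headful]:
    """Idealized processive headful packaging along a head-to-tail concatemer."""
    if genome_bp <= 0 or headful_percent < 100 or n_rounds < 0:
        raise ValueError("invalid packaging parameters")
    headful_bp = genome_bp * headful_percent // 100
    redundancy = headful_bp - genome_bp
    molecules = []
    for i in range(n_rounds):
        # Each round starts where the previous headful cut ended.
        start = (i * headful_bp) % genome_bp
        molecules.append(Headful(i + 1, start, headful_bp, redundancy))
    return molecules


if __name__ == "__main__":
    for molecule in successive_headfuls(44_000, n_rounds=4):
        print(molecule)
```

Each successive molecule starts one redundancy length further along the genome, so only a limited stretch of start points is sampled in a short packaging series, which is what "partially" circularly permuted means in this simplified picture.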

The Terminase and Initiation of DNA Packaging

Initiation of phage DNA packaging involves the specific
interaction of the procapsid with virus DNA. This process
is mediated by viral nonstructural proteins, called terminases,
which specifically recognize viral DNA and cleave it
(reviewed in 16, 55, 90). The SPP1 terminase is composed
of two subunits that achieve these two functions in the
absence of other phage products (32, 59).

The SPP1 terminase enzyme hetero-oligomers are
composed of decameric G1P and monomeric G2P (28, 30,
58, 59). The terminase small subunit, G1P, is a 183 amino
acid long polypeptide that specifically recognizes the bipartite
packaging initiation site [pacL (nonencapsidated
DNA end or left end) and pacR (the end to be encapsidated
or right end); figure 23-7]. G1P, upon interacting
with pacL and pacR, forms a nucleoprotein structure that
helps to position the terminase large subunit, G2P, at the
packaging processing site, pacC (32, 37) and interacts with
G6P (25). G2P, which is a 422 amino acid polypeptide,
introduces two 2 bp staggered nicks at each of the 10 bp
repeat (box b) sites found within pacC (28, 29, 30, 32, 37, 58,
59). G1P enhances the G2P ATPase activity approximately
4-fold, but exerts a negative effect on its endonuclease
activity (59). Presence of G6P further enhances the G2P
ATPase activity, probably because the proteins form a
complex that becomes active for DNA translocation (25).
The DNA end protected by the terminase (encapsidated
end) is then translocated into the prohead in an ATP-driven
reaction, whereas the nonencapsidated end (pacL) is
degraded (25, 32, 59; figure 23-8).

The SPP1 DNA packaging reaction was reproduced
in vitro by combination of extracts of B. subtilis cells infected
with SPP1 mutants that are defective either in synthesis
of one terminase subunit or in the assembly of the
procapsid (44). Biologically active procapsids are present in
the first extract (procapsid donor) and terminase in the
second (terminase donor). SPP1 DNA is present
in both extracts. Upon mixture of the two extracts all
requirements for phage DNA packaging are present and
infective virions assemble. A noteworthy result is that
the DNA packaged originates exclusively from the labile
terminase donor extract. Terminase thus appears to be
already loaded on the substrate DNA when extracts are
mixed, while free active enzyme that could bind SPP1
DNA in the procapsid donor extract is not available.
The terminase-DNA complex was proposed to be significantly
more stable than free terminase and would be the
active form for the packaging reaction under conditions
where procapsid assembly and DNA encapsidation are
uncoupled (44).

DNA Translocation 

The SPP1 DNA-terminase complex docks at the portal
vertex of the procapsid to initiate DNA packaging.
This complex probably contains a terminase that is simultaneously
bound to the SPP1 pac site and to the procapsid
(figure 23-8). DNA translocation is then initiated
with a concomitant exit of the scaffolding protein G11P from
the procapsid interior, thereby creating empty space for DNA
packing. The DNA packaging motor is likely composed of the
terminase, whose ATPase activity fuels DNA translocation,
and of the portal protein through whose central channel
the DNA is pumped. A significant number of point mutations
in G6P block or reduce the efficiency of SPP1 DNA packaging,
confirming its central role in this process (66).

Figure 23-8 DNA packaging in bacteriophage SPP1. The
SPP1 DNA packaging has been divided in five different steps.
1: The terminase complex (G1P-G2P, represented as gray
ovals and smaller gray circles, respectively) is assembled at
one pac site of the phage DNA concatemer and recognizes
the portal protein region of the procapsid (representation
of the individual components is as in figure 23-5). The
arrow indicates the staggered nicks in pacC (see figure 23-7).
2: After DNA cutting at pac the encapsidated end (pacR) is
translocated into the procapsid and the nonencapsidated
end (pacL) is being degraded. 3: With translocation of
the DNA the scaffolding protein is released and the capsid
shell expands. 4: When the capsid is filled with DNA the
G6P sensor will give the signal for cutting the DNA and
transfer the terminase-DNA concatemer complex to
another procapsid. 5: After the headful cut, DNA encapsidation
starts again with a new procapsid. The DNA-filled
capsid proceeds to the final steps of phage morphogenesis.
See thebacteriophages.org/frames_0230.htm for color
version of this figure.

The molecular mechanism of how viral DNA is mechanically
translocated to the capsid interior is not understood
in any phage system. It is not clear whether the
DNA transport is a linear translocation or whether it
involves a relative rotation between portal protein and
DNA. Dube et al. (46) proposed a model for DNA translocation
that exploits the symmetry mismatch between the
helical symmetry of DNA and the symmetry of the portal
protein. In this model the portal protein would rotate relative
to the capsid and to the DNA helix, as initially proposed
by Hendrix (61), to keep a constant positioning of the
portal subunits relative to the DNA being packaged. This
constant topology would be necessary for the portal subunits
to carry out mechanical translocation of DNA in 2 bp
steps powered by the terminase ATPase activity. The DNA
would be translated to the capsid interior without significant
rotations relative to the capsid. The model was formulated
for a 13-mer symmetric portal protein but applies
also to a 12-mer oligomer (46). Alternatively, G2P bound to
DNA pulls it unidirectionally toward the G1P-G6P complex
in an ATP-dependent process. The translocating G2P
hydrolyzes ATP during its translocation along the linear
lattice of double-stranded DNA. The translocated DNA is
then passed through the central hole of the G6P turbine.
This translocating mechanism resembles the inchworm
model initially described for single-strand translocases
or DNA helicases (see 108). Once the capsid is full, by
an uncharacterized step, G6P might transfer the headful
signal to G2P. Translocase activity would be subsequently
attenuated, and a weak nuclease activity would be activated
(25; see also below).

During DNA packaging the subunits of the major
capsid protein, G13P, undergo a conformational change
which results in both expansion of the shell to the
mature size and an increased angularity of the structure
(figures 23-5 and 23-8; 12, 45).


Packaging Termination: The Headful Packaging Mechanism

When the head is full, the DNA inside the shell is separated
from the DNA remaining outside by a headful
cleavage. This sequence-independent and also imprecise
cut (65, 119) is determined by the amount of DNA present
inside the capsid. Siz mutations, which lead to undersizing
of the DNA packaged, were only identified in the gene
coding for the SPP1 portal protein G6P (120; see 27 for a
related phenotype in bacteriophage P22). The most severe
of these mutations are associated with deletions of nonessential
regions of the SPP1 genome. The deletions were
required to ensure presence of the complete set of phage
essential genes in the shorter packaged chromosome.
Combination of siz mutations leads to encapsidation of
DNA molecules significantly smaller than the mature
chromosomes of single siz mutants (118). Additionally, the
efficiency of DNA packaging is reduced, leading to the
suggestion that a trigger for headful cleavage could be
the incapacity of the packaging machinery to encapsidate
further DNA into the capsid (65, 118). Comparison
between the structures of the SPP1 wild-type protein and
a G6P form carrying a siz mutation showed significant
differences in the crown. This crown is a flexible domain
that delimits the central portal channel at the wider region
of the G6P structure (figure 23-6; 93, 94). Its orientation
to the capsid interior provides an ideal topology to act as
a sensor of the mass of packaged DNA. The increasing
pressure exerted by the DNA on the sensor domain was
proposed to trigger a conformational change in this
domain leading to immobilization of DNA inside the portal
channel or to a stopping of the packaging machinery
activity (93). This signal could alter the stoichiometry of the
G6P:G1P:G2P translocase complex. An imbalance in the
G6P:G1P:G2P complex, perhaps a low amount of G1P or
an excess of G6P, “induces” G2P to introduce a nonspecific
cleavage (headful cut), separating the packaged DNA
from the concatemer (25, 59). The headful cleavage generates
a new DNA end that serves as the starting point for a
subsequent round of DNA packaging (32, 37, 118-120).

Stabilization of the Packaged DNA 

After the terminase departs, the portal pore is plugged by
the head completion proteins, G15P and G16P, to avoid
release of the packaged DNA (22, 76, 94). A mutation in G6P
was shown to block specifically this step (see note added in
proof on page 345). The structure assembled at the portal
vertex (connector) comprises stacked rings of G6P, G15P,
and G16P subunits crossed by a central channel that is
closed at the level of G16P (figure 23-6; 76, 94). In phages
whose capsid was disrupted, the connector remains
attached to the tail structure through the G16P ring region
and the complex has been shown to protect a DNA fragment
of about 20 nm (119). This size fits particularly well to the
connector height (figure 23-6). The DNA extremity fixed to
the connector was the last one to be packaged and is the
first to exit the virion at the beginning of infection (119).
The connector thus acts as a valve that retains DNA inside
the capsid until a signal for release is triggered by interaction
of SPP1 with its receptor at the B. subtilis cell surface.
This signal must be transmitted through the complete tail
structure from the tail fiber to the connector (figure 23-1).

Tail Assembly 

The SPP1 tail is assembled in a morphogenetic pathway
independent from capsid assembly. Its flexible helical
region is composed of G17.1P and of a lower amount of
G17.1*P (figure 23-1). The two proteins have the same
N-terminus but G17.1*P has a higher molecular weight than
G17.1P (48; A. Dröge and F. Weise, personal communication).
No information is available about the components of
the SPP1 tail fiber. Attachment of the tail to the connector,
most probably to G16P (76), yields the SPP1 infective phage.

Lysis 

Lysis of B. subtilis cells infected at 37 °C with SPP1 starts
around 30 minutes post-infection. The mechanisms of lysis
and how its timing is controlled were not studied. Two
proteins that share identity with holins (G26P) and
N-acetylmuramoyl-L-alanine amidases (G25P) are encoded by
the SPP1 genome (3). These are the most likely effectors of
lysis (see chapter 10).

SPP1 as a Genetic Tool 

Transduction 

Bacteriophage SPP1 mediates generalized transduction of
host chromosomal markers (134) and plasmid transduction
(26). The substrate for packaging of transducing DNA
can be either chromosomal DNA of the host or concatemeric
plasmid DNA. Both substrates are packaged by an
identical mechanism.

SPP1 transducing particles transfer approximately 1% of 
the host B. subtilis chromosome from a donor to a recipient
strain with an efficiency of 10~ 7 to 10~ x transductants per 
viable phage (134). This feature made SPP1 a tool widely 
used for genetic mapping of the B. subtilis chromosome. 
The transducing particles are indistinguishable from 
SPP1 virions with the exception that they carry a DNA 
molecule of bacterial origin with a size similar to that of 
the SPP1 chromosome (39,40). 

Plasmids, which are usually smaller than a bacteriophage
genome, can be encapsidated into phage procapsids
to produce plasmid transducing particles (74). The
encapsidation of linear plasmid concatemers has been detected
in all bacteria-phage systems tested so far (2, 24, 38, 70, 
92, 105). Plasmid transduction frequencies are generally 
low, about 10~ x to 10~' per active phage particle. However, 
when homology is provided between the phage genome 
and the plasmid, transduction frequency increases up to
10⁵-fold (transduction facilitation effect) (2, 38, 92, 105).
With B. subtilis phage SPP1, homology of as little as 47 bp 
provided a maximal facilitation effect (2). Beyond this 
observation, however, the molecular events that trigger 
the synthesis of packagable concatemeric plasmid DNA are 
not well understood. In bacteria deficient in ExoV (also
termed RecBCD or AddAB activity) the accumulation of
high molecular weight (hmw) linear head-to-tail plasmid
concatemers was observed (34, 130).

SPP1 infection of B. subtilis cells bearing plasmids
induces the synthesis of concatemeric plasmid DNA, and
such hmw DNA synthesis is independent of plasmid-encoded
replication functions (19). Accumulation of hmw
plasmid DNA, starting simultaneously with the onset of
SPP1 sigma replication (24), was dependent on phage-encoded
function(s) but independent of host-encoded
primosomal functions (see 131).

The molecular events that trigger the synthesis of
packagable head-to-tail chimeric SPP1:plasmid concatemeric
DNA might partially follow the mechanism
used by SPP1 RDR (see above). Upon SPP1 infection, a
phage-encoded product might directly or indirectly inactivate
the AddAB enzyme with the subsequent accumulation
of plasmid stalled replication forks. The repair of these
stalled replication forks requires the SPP1-encoded G35P
and G34.1P recombination proteins to create a D-loop (a
single-stranded 3'-OH end of phage DNA or plasmid DNA
invading a homologous region of the dsDNA plasmid
molecule) (R. Missich and S. Ayora, personal communication).
The synapsis between a phage and a plasmid molecule
would lead to the generation of a phage:plasmid
DNA chimera carrying the pac site. The terminase enzyme
recognizes and cleaves the pac site present in the chimera to
initiate unidirectional and processive “headful” packaging
(32, 70). In the absence of apparent homology between
the plasmid and the phage, however, the hmw DNA could
be packaged at a very low frequency into empty procapsids
by an uncharacterized mechanism.

Transfection 

Exposure of competent bacterial cells to bacteriophage
DNA, leading to the production of intact viruses in such
cells, was first observed in B. subtilis and its phage
SP50 and called “transfection” (a recombinant word from
transformation and infection) (52). A peculiar feature of
transfection in B. subtilis is its dose-response, that is, the
relationship between the concentration of input phage DNA
used in the transfection experiment and the number of
cells yielding a phage burst (123). Whereas a linear relationship
is observed in infection, transfection is always characterized
by a quadratic or even higher order dependence
on the DNA concentration. This dose-response was observed
over a wide range of DNA concentrations. It is assumed
that this relationship reflects different modes of entry of
phage DNA molecules into the cell. In infection, the mature
phage chromosome is transported into the interior of the
cell by the phage ejection mechanism. This dsDNA molecule,
which contains the complete viral genome, will immediately
enter the replication cycle. Transfection, however,
follows the entry mode of bacterial or plasmid transformation
in B. subtilis. Here double-stranded transforming
DNA is fragmented following attachment to competent
cells (47). Of each fragment only one strand is taken up by
the cells, the other being degraded (96). Most likely, such
uptake is polar with respect to the 5' to 3' orientation of
DNA, as in S. pneumoniae (82). In such a scenario only the
uptake of at least two phage DNA molecules in independent
entry events will provide a transfected cell with an
ensemble of phage DNA fragments derived from strands
with different strand polarity. Alignment of such fragments
by annealing within regions of sequence complementarity,
followed by gap-filling DNA synthesis, would generate
a replicative molecule. Fragmentation of transfecting DNA
as the cause of a multi-powered dose-response was first
suggested for transfection with B. subtilis phage SP82 (57).
The processing of transfected DNA as described is compatible
with the observation that, in contrast to transformation,
transfection is sensitive to restriction (122, 124). Phage
DNA loses transfecting activity following degradation by
a restriction endonuclease since, in spite of uptake of
such DNA, no contiguous DNA molecule can be formed
intracellularly. In line with the model of DNA processing
discussed, however, combinations of two endonuclease
digests have transfection activity, provided the restriction
fragments generated have large overlaps (64, 120).
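
The contrast between a linear and a quadratic dose-response can be illustrated with a simple counting argument. The sketch below assumes, purely for illustration, that the number of phage DNA molecules taken up by a competent cell is Poisson distributed with a mean proportional to the DNA concentration; this statistical model and the function names are assumptions introduced here, not part of the cited experiments. If one entry event suffices, the productive fraction of cells grows linearly with dose at low dose; if at least two independent entry events are needed, it grows with the square of the dose.

```python
import math


def p_at_least(k: int, mean_uptake: float) -> float:
    """Poisson probability that a cell takes up at least k DNA molecules."""
    if k < 0 or mean_uptake < 0:
        raise ValueError("k and mean_uptake must be non-negative")
    below = sum(
        math.exp(-mean_uptake) * mean_uptake ** i / math.factorial(i)
        for i in range(k)
    )
    return 1.0 - below


def loglog_slope(k: int, low: float, high: float) -> float:
    """Apparent order of the dose-response between two mean uptake values."""
    rise = math.log(p_at_least(k, high)) - math.log(p_at_least(k, low))
    return rise / (math.log(high) - math.log(low))


if __name__ == "__main__":
    # Mean uptake per cell stands in for DNA concentration (arbitrary units).
    for k in (1, 2, 3):
        slope = loglog_slope(k, 0.001, 0.01)
        print(f"events required: {k}, log-log slope: {slope:.2f}")
```

In this toy model the slope on a log-log plot approaches the number of independent entry events required, which is one way to read a quadratic or higher-order dependence on DNA concentration.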

If transfection is performed by mixtures of two genetically
distinguishable SPP1 DNAs differing in two genes,
where recombinants can be scored in the progeny, recombination
is significantly enhanced over the level observed
in conventional phage crosses (110). This would be anticipated
with the patching-up model described.

The postulated uptake mechanism in transfection also 
explains the low efficiency of transfection in compari¬ 
son with infection. Apparently many cells take up phage 
DNA without being able to process it to replicative progeny. 
Such abortive transfection becomes obvious under marker 
rescue conditions, when competent cells were infected 
with conditionally lethal mutant phages simultaneously 
or prior to exposure to wild-type phage DNA. Under such 
conditions, in which the wild-type allele of the DNA 
can substitute for the mutant allele of the “helper” phage,
the efficiency of transfection is dramatically enhanced, 
and the dose-response of transfection becomes linear. 
Also, fragmented wild-type DNA, otherwise inactive, will 
serve under these conditions. Such marker rescue experi¬ 
ments using a wide spectrum of mutant helper phages 
and individual SPP1 DNA restriction fragments have led to 
an unambiguous assignment of mutations to restriction 
fragments, thus providing a coarse correlation between the 
genetic and physical maps of the phage (13). 

Purified complementary single strands of SPP1 DNA 
(neither of which is alone active in transfection) could be 
annealed to produce transfecting molecules. When such 
annealing was performed with single strands derived 
from genetically different and distinguishable SPP1 phages, 
one could generate in each combination two types 
of heteroduplex molecule, in which the alleles from each 
“parental” molecule were carried either on the “heavy”
or “light” strand. Performing single-burst analyses of
transfections with such heteroduplex DNA in combinations
of wild-type DNA and the DNAs of a large number of
mutants, we observed pure wild-type bursts, pure mutant
bursts, or mixed bursts (109, 125). It was intriguing that
in a given pair of heteroduplexes the frequencies of pure 
wild-type and pure mutant bursts were not identical. 
There was a strong bias for one of the two phenotypes, 
which was strand-specific rather than allele-specific. This 
observation suggests strand discrimination before the 
onset of phage replication. At the time when these results 
were obtained, neither the intricacies of DNA processing in 
transfection nor the architecture of SPP1 molecules were 
known. Therefore we assumed that a heteroduplex mole¬ 
cule would be taken up as such and that the mismatched 
site would induce repair of the mismatch similar to a gene 
conversion process (“conversion model”). At the present 
stage this interpretation must be revised on the basis of 
information available on DNA uptake. It is conceivable
that in the uptake of heteroduplex molecules one strand 
is lost with a higher frequency than the other and that 
the strand preferentially taken up serves as a template in 
the patch-up synthesis described before (“strand-selection 
model”). However, neither the “conversion model” nor 
the “strand-selection model” can explain the strong strand
bias observed in our experiments. Bacterial and/or phage 
mutations deficient in mismatch repair, if available, would 
be helpful in assessing the contribution of each of the 
two processes to the known observations. 

SPP1 as a Cloning Vector 

SPP1 phages have also been useful as cloning vectors. In 
these constructs part of the dispensable region of EcoRI 
fragment 1 of SPP1 DNA was eliminated and a unique restriction
site, either BamHI (SPP1v; 60) or PstI (SPP1vic; 14),
was engineered into the fragment. Restriction fragments
of heterologous DNA of up to 4 kb could be maintained and 
expressed in these vectors. 


Evolution 

The SPP1-Like Phages

The B. subtilis phages ρ15, SF6, and 41c (phage 22a is a
different isolate of the same virus), isolated in different
geographical regions (67, 114), are highly related to bacteriophage
SPP1 by serological, biological, biochemical, and
genome sequence similarity criteria (77, 103, 104; see
“Adsorption and Infection”). DNA heteroduplex analysis
showed that all phage DNAs shared extensive regions of
homology interrupted by short nonhybridizing regions.
Crosses between the phages yielded viable hybrid progeny,
showing that this group of viruses shares a common genetic
pool (101; M.A. Santos, personal communication).

SPP1, the Prototype of a Wide Group of Gram-Positive Phages

SPP1 proteins exhibit amino acid sequence similarity to
proteins of a variety of phages that infect other Gram-positive
bacteria of genera such as Streptococcus, Lactobacillus,
Listeria, and Staphylococcus (3, 21, 32, 73, 95, 113, 115;
chapter 4). Detectable similarity between SPP1 and each
of those phages is limited to a few proteins. Biochemical
and/or functional studies in the SPP1 system showed
that these correspond frequently to groups of proteins that
interact during SPP1 multiplication (e.g., the two terminase
subunits G1P-G2P (58); G6P-G7P (45, 115); the connector
proteins G25P-G16P and phage tail proteins (76; Dröge,
personal communication)). This observation suggests that
the homologous proteins from other phage species have 
a function similar to SPP1 proteins. Constraints generated 
by the interaction between proteins thus appear to be 
an important factor that might slow down their sequence 
divergence due to coevolutionary requirements. This feature 
would facilitate identification of relationships between 
phages. Functional, biochemical, and structural informa¬ 
tion is a major requirement for understanding the forces 
driving virus evolution. The wealth of genetic and biochem¬ 
ical data on the molecular mechanisms that support the 
different aspects of the SPP1 life cycle strongly recommends
it as the prototype for the group of phages described above 
that includes species of medical and economic interest. 

Detectable relatedness of SPP1 proteins at the amino
acid sequence level is essentially found with Siphoviridae
phages infecting bacteria that are closely related
phylogenetically but can occupy different ecological niches. The
evolution of these phages of Gram-positive bacteria seems 
therefore to be mostly shaped by the evolutionary history 
of their hosts. Host specificity appears as an effective 
barrier that avoids frequent horizontal transfer between 
viruses infecting distantly related bacteria, even though
some exceptions were identified (23). Sequence similarity is 
eroded by the long-distance evolution of phages, but anal¬ 
ysis of their gene order and of the molecular strategies 
used for viral multiplication provides evidence of ancient 
evolutionary relatedness between SPP1 and the other 
tailed bacteriophages (see below; 23). 

Gene Order and Biochemical Strategies: Long-Term Evolutionary Conservation

The SPP1 genomic organization in the packaging and 
morphogenesis operons and in the initiation-of-replication 
region resembles those in coliphages λ and P22, but no or
very little conservation was detected at the level of the
deduced amino acid sequence (32, 36, 48, 95).

Conservation of the genomic organization in the packaging
and head morphogenesis operons in the absence of
amino acid sequence similarity between analogous proteins
can be the result of two different types of evolutionary
mechanisms. The amino acid differences between isofunctional
proteins of phylogenetically distant phages infecting
Gram-positive and Gram-negative hosts can be attributed,
if the principle of neutral evolution applies, to a constant
rate of amino acid substitution after divergence from a
common ancestor. The order of the genes would be kept and 
reflect the arrangement present in a common phage ances¬ 
tor. This hypothesis would imply a coevolutionary process 
that, in spite of drastic modifications in the polypep¬ 
tides, would still allow productive interaction between the 
various morphogenetic proteins. Alternatively, we can 
assume the existence of different ancestors, with morpho¬ 
genetic genes brought together by convergent evolution. 
This would readily explain the nonrelatedness among 
primary structures of the proteins but also would imply 
that: (i) numerous proteins evolved independently to 
accomplish the same function in the construction of a 
complex macromolecular structure using a remarkably 
similar assembly strategy, and (ii) the genes independently 
became clustered in a conserved order in distant lineages. 

Independent of the evolutionary mechanism that pre¬ 
served or, alternatively, brought together morphogenetic 
genes, the conservation of the gene arrangement indicates 
that it is an advantageous property for the phage. The driv¬ 
ing force for this process could be the selective pressure 
to cluster cistrons whose products intensively interact 
(111) such that a genomic module would correspond to a 
functional unit (the replication apparatus, the packaging 
machinery, etc.). Fisher (53) has predicted that if some 
allele combination is superior to other combinations, such 
linkage can be transmitted intact, suffering only minimal 
disruption by recombination. 

Note Added in Proof 

The Bacillus subtilis receptor for bacteriophage SPP1 was
recently shown to be the transmembrane protein YueB
(São-José, C., C. Baptista, and M.A. Santos. 2004. Bacillus
subtilis operon encoding a membrane receptor for bacteriophage
SPP1. J. Bacteriol. 186:8337-8346).

Mutations in the SPP1 portal protein G6P that specifically
block plugging of the portal pore after termination of
DNA packaging were recently described (Isidro, A., A. O.
Henriques, and P. Tavares. 2004. The portal protein plays
essential roles at different steps of the SPP1 DNA packaging
process. Virology 322:253-263).


Bacteriophage P1

HANSJÖRG LEHNHERR


In the first edition of this volume, Michael B.
Yarmolinsky and the late Nat L. Sternberg wrote a 147-page
chapter filled to the brim with scientific, historical,
and anecdotal information about bacteriophage P1 (198).
Their work was the first comprehensive review covering
35 years of P1 research. Following in their footsteps, this
review inevitably relies heavily on its predecessor. Any
reader with an interest in P1 is strongly advised to also read
the Yarmolinsky and Sternberg review, which extensively
covers aspects of P1 biology that are still valid and up to
date and therefore not covered here.

It is the aim of this account to provide the reader with a
general overview of bacteriophage P1 biology, with special
focus on developments during the last 15 years. The data are
arranged to reflect a lytic phage infection cycle, with an
inbuilt detour covering the plasmid life style of P1 during
lysogeny.


Brief History 

In 1951 Giuseppe Bertani managed to isolate three temperate
bacteriophages from the Escherichia coli strain of Lisbonne
and Carrere (14). The phages he baptized P1, P2, and P3
were serologically distinct and differed also in plaque size,
with P1 showing the smallest plaques on a slightly pathogenic
Shigella dysenteriae host. As plaque size was a very
important feature for early studies on bacteriophages,
Bertani himself opted against P1 and devoted much of his
long life in science to the investigation of bacteriophage P2
(15). Phage P3 received very little attention and is nowadays
almost completely forgotten. Bacteriophage P1 might have
suffered a similar fate if not for Lennox (122), who found in
1955 that P1 is able to mediate generalized transduction.
Since then P1 has been widely used as a tool to construct
new bacterial strains with specific genotypes. Generalized
transduction also provided an invaluable tool in the fine
mapping of the E. coli chromosome in times when no
genome sequencing efforts were possible. Accordingly, a
stock of bacteriophage P1 can be found in almost every
molecular biology laboratory around the world. Maybe as a
consequence of this omnipresence, bacteriophage P1 served
as a model organism in the study of many fundamental
aspects of bacterial and phage biology, such as DNA restriction
modification (8, 45), site-specific recombination (68),
plasmid replication (24), partition (9), incompatibility (173),
and addiction (118). Next to the bacteriophages T4 (chapter
18) and λ (chapter 27), P1 is among the best characterized
today (119). Many aspects of its life cycle are understood in
molecular detail, and the nucleotide sequence of the
entire P1 genome has recently been determined (124).


Early Infection 

Morphology 

Figure 24-1 shows an electron micrograph of a bacteriophage
P1 particle. The phage genome is packaged into a
head structure of icosahedral symmetry. The head is connected
to an inflexible tail, formed by a rigid tail tube that is
covered by a contractile sheath. The distal end of the tail
is formed by a baseplate, to which six kinked tail fibers
are attached (186). The general morphology of a P1 particle
(187) is very similar to the morphology of other tailed
bacteriophages such as Mu (chapter 30) or the group of T-even
phages. Surprisingly though, there is very little amino acid
sequence conservation found between these phages (106).
The structural proteins of bacteriophage P1, with the exception
of the tail fibers (64, 153), have not been characterized
in detail.

Host Specificity 

Bacteriophage P1 has a relatively wide host range, including
a variety of Gram-negative species (198). The host specificity
is governed by the cix-cin site-specific DNA inversion system
(86, 99). As illustrated in figure 24-2, inversion of a 4.2 kb
C-segment leads to the expression of alternate sets of tail
fiber genes (R, S, U or R, S', U'). Phages carrying RSU fibers
were shown to be able to infect E. coli K12, E. coli B, and E.
coli C, while phages carrying RS'U' fibers do not normally
infect these strains (97, 198). Inversion rarely occurs during
lytic growth, but happens sufficiently frequently during
lysogenic growth to completely randomize the orientation
of the C-segment in a population. Thus phage stocks induced
from a P1 lysogenic strain will contain equal numbers of
particles with either host specificity (97, 99).

Figure 24-1 Electron micrograph of a bacteriophage P1
particle. The black bar represents 100 nm. Courtesy of
M. Wurtz, Biocentre of the University of Basel, Switzerland.

The precise inversion reaction is mediated by the 21.2 kDa
Cin recombinase (67, 86). Crossing over occurs between two
26 bp cix sequences, flanking the C-segment. The cix sites
contain imperfect inverted repeat sequences, which serve
as recognition sites for the invertase protein (94, 95). Besides
the Cin recombinase and the cix sites, the host factor for
inversion stimulation, FIS (67, 108, 110), and a 72 bp enhancer
sequence, sis, located within the cin gene (90, 93), are
also required for efficient inversion. FIS binds to and bends the
enhancer sequence (92, 93), providing the mould for the
assembly of a precisely ordered synaptic Cin-cix complex,
stimulating the efficiency of the inversion reaction while
simultaneously preventing the formation of deletions (67).
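
At the sequence level, inversion of a segment between two recombination sites amounts to replacing that segment with its reverse complement, which is why a second inversion restores the original orientation. The sketch below illustrates only this bookkeeping on a toy sequence. The coordinates and the toy sequence are hypothetical, and nothing here models the Cin, FIS, or enhancer requirements described above.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def invert_segment(genome, start, end):
    """Return genome with genome[start:end] replaced by its reverse complement.

    start and end stand in for the crossover points within the two
    flanking recombination sites (hypothetical coordinates).
    """
    return genome[:start] + reverse_complement(genome[start:end]) + genome[end:]


if __name__ == "__main__":
    toy = "AAAA" + "GATTACA" + "CCCC"   # flank + invertible segment + flank
    flipped = invert_segment(toy, 4, 11)
    print(flipped)
    print(invert_segment(flipped, 4, 11) == toy)
```

Because the operation is its own inverse, a population in which inversion occurs often enough ends up with both orientations represented, consistent with the randomization described for lysogenic growth.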

Tail fiber variation by DNA inversion is not a unique 
feature of bacteriophage P1. Related systems include gin of
phage Mu (105), pin of the defective prophage e14 (144), min
of the defective plasmid-prophage p15B (100, 153, 154) and
ein of Carotovoricin Er (140). In the bacterium Salmonella 
typhimurium an analogous hin system controls the expres¬ 
sion of surface antigens (163). The respective recombinases 
show high degrees of similarity and, where tested, are 
functionally interchangeable (60). 

Adsorption and Injection 

The tail fibers promote the recognition of a potential host
bacterium. The P1 receptor has been identified as a terminal
glucose moiety of the lipopolysaccharide core of the
bacterial outer membrane (155). The interaction of at least
three of the six tail fibers with specific receptor molecules is
assumed to be sufficient to trigger the injection mechanism
(34). The phage tail contracts, the tail tube is pushed through
the baseplate, punctures the outer cell membrane and,
according to most models (43), is also pushed through the
bacterial cell wall. The latter process is facilitated by a
phage-specific enzyme with mureolytic activity (116; H.
Lehnherr, unpublished data), a so-called lytic transglycosylase
(49). Previous studies linked the cell-wall-degrading
activity of lytic transglycosylases to host cell lysis (49, 109),
but recent evidence for the phages T7 and PRD1 demonstrated
their role at the onset of an infection cycle (135, 151).

The content of the phage head is then injected into the
periplasmic space of the host cell. The uptake of the P1
genome from the periplasm into the cytoplasm is mediated
by an as yet uncharacterized pore in the inner membrane.
Lobocka et al. (124) proposed that, in analogy to the mechanism
found in bacteriophage T7 (57, 176), a large internal
head protein of bacteriophage P1, DarB (101), might form
the required inner-membrane pore.

Another mechanism connected to the injection process
is the sim (107) superinfection exclusion system. The Sim
protein is synthesized as a 29.5 kDa precursor protein, and
SecA-dependent cleavage of a 20 amino acid, hydrophobic
leader peptide is essential for its function (129). Sim is
expressed immediately after infection, and the mature 27.4
kDa protein is located in the inner membrane or the periplasmic
space of the infected host cell (129). A plausible,
though as yet unproven, hypothesis is that the Sim protein
traps superinfecting P1 genomes in the periplasm, thus
preventing them from interfering with the timing of an
ongoing P1 infection cycle (107, 129).

Figure 24-2 Organization of the DNA inversion system of P1. The two cix recombination sites flanking the invertible
C-segment, and the enhancer sequence sis, located within the cin gene, are shown as black boxes. Two promoters, Ps,
expressing the tail fiber operon, and Pcin, expressing the cin gene, are shown as filled triangles. Genes are shown as open
boxes with arrowheads indicating the direction of transcription. In the depicted orientation the three tail fiber proteins
R, S, and U are produced. A single inversion event would fuse Sc to Sv', resulting in the expression of the three proteins R,
S', and U'.

Upon entering the cytoplasm the linear P1 DNA is rapidly
circularized in order to avoid degradation of the phage DNA
by cellular nucleases (174). Circularization is mediated either
by Cre-dependent site-specific recombination, using the
two lox-Cre sites present in some phage genomes (87), by
homologous recombination between the terminally redundant
ends present in all P1 genomes, or by other, as yet
uncharacterized mechanisms (198).

Restriction and Anti-restriction 

Bacteriophage P1 specifies a type III restriction-modification
system, which provides a P1 lysogen with a defence mechanism
against superinfection by heterologous bacteriophages
(17). Though the first of the three types of restriction-modification
systems (17) to be discovered (8, 45), type III
systems trailed in being characterized. It is mainly research
in the last decade that has shed light on their activity. Two
large proteins, the 74 kDa modification subunit, Mod, and
the 111 kDa restriction subunit, Res (91), form the EcoP1I
restriction-modification enzyme complex, with a subunit
composition of Res(2)Mod(2) (104). Sequence-specific binding
to an asymmetric recognition sequence, AGACC (11),
which is required for both methylation and restriction, is
mediated by the modification subunit (6, 7, 66, 89). As a
consequence of the asymmetry, only the adenine marked A
above is methylated by the Mod subunit, and the second
strand of the recognition sequence, lacking an adenine,
remains unmethylated. Figure 24-3 illustrates that upon
replication of such hemimethylated sites, totally unmodified
sites in the same orientation arise. For most restriction-modification
systems, unmodified sites represent the signal
to cleave DNA molecules that contain them (17). However,
newly replicated DNA, of either P1 or host origin, is not a
substrate for EcoP1I cleavage. This conceptual paradox was
resolved when it was shown that two unmethylated sites,
located in inverse orientation, are required for restriction
activity (132, 133). The EcoP1I enzyme complex does bind to
unmethylated sites and starts tracking along the DNA, using
ATPase and helicase activities (104, 152, 199), but only the
collision between two convergently tracking enzyme complexes
triggers the actual cleavage reaction (104, 133).
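
The orientation rule can be illustrated with a short sketch that locates AGACC sites on both strands and applies a deliberately simplified criterion: DNA counts as a potential substrate only if it carries unmethylated sites in both orientations. The helper names and the toy sequence are hypothetical. The sketch ignores distances between sites and the details of tracking and collision, and it assumes that after replication a site is left unmodified only when its methylatable strand is the newly synthesized one.

```python
SITE = "AGACC"      # recognition sequence as given in the text (top strand)
SITE_RC = "GGTCT"   # reverse complement: the same site read from the bottom strand


def find_sites(top_strand):
    """Return (position, orientation) pairs; '+' means AGACC lies on the top strand."""
    sites = []
    for i in range(len(top_strand) - len(SITE) + 1):
        window = top_strand[i:i + len(SITE)]
        if window == SITE:
            sites.append((i, "+"))
        elif window == SITE_RC:
            sites.append((i, "-"))
    return sites


def cleavable(unmethylated_sites):
    """Simplified rule: cleavage needs unmethylated sites in both orientations."""
    return {orientation for _, orientation in unmethylated_sites} == {"+", "-"}


def unmethylated_after_replication(sites, new_strand):
    """Sites left unmodified in the daughter whose new strand is 'top' or 'bottom'."""
    wanted = "+" if new_strand == "top" else "-"
    return [site for site in sites if site[1] == wanted]


if __name__ == "__main__":
    dna = "TT" + "AGACC" + "TTTT" + "GGTCT" + "TT"
    sites = find_sites(dna)
    print(sites, cleavable(sites))   # incoming, fully unmodified DNA
    for strand in ("top", "bottom"):
        print(strand, cleavable(unmethylated_after_replication(sites, strand)))
```

In this toy model, incoming unmodified DNA carrying sites in both orientations is flagged as cleavable, whereas each daughter of a fully modified parent carries unmodified sites in one orientation only and is not.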

The introduction of an EcoP1I restriction-modification
system into a new host cell upon P1 infection is potentially
suicidal for the phage, as the unchecked expression of
restriction activity could result in the destruction of the
unmethylated host genome (146). The establishment of
lysogeny, in particular, would be severely hampered by such an
activity. However, it was observed that while the modification
activity is expressed immediately after DNA injection,
restriction activity could only be detected after a considerable
delay, allowing the modification subunit, which also
acts as a monofunctional modification methyltransferase
(6, 7), to completely methylate the host DNA (146). Such a
sequential expression was not intuitively expected from the
organization of the mod and res genes in a single operon (91,
98). Several lines of evidence now indicate that controls at
both the transcriptional and, predominantly, the translational
level are responsible for the delayed expression of the
restriction subunit, providing the modification subunit with
a head start (146, 160).

Newly injected bacteriophage P1 DNA is protected
against degradation by type I restriction-modification systems
specified by Enterobacteriaceae (17, 101, 139). Responsible
for this protection are two defence-against-restriction
proteins, DarA and DarB. The DarA protein is synthesized
as a precursor, which runs with an apparent molecular
weight of 77 kDa during SDS-PAGE (175), though its calculated
molecular weight is only 69.5 kDa (96). DarA is then
processed by a particle maturation protease (175), and the
mature form, with an apparent molecular weight of 68 kDa,
is packaged into phage particles (175). DarB is also an
internal head protein and both DarA and DarB are injected
into the host cell alongside the phage DNA, where they act
exclusively in cis (101). The Dar proteins do not directly
inactivate the type I restriction-modification enzymes, and,
as P1 DNA isolated from phage particles was not protected
against type I restriction in vitro, Iida et al. (101) concluded
that the protection was not due to a chemical modification
either. Rather, as type I restriction enzymes track along the
DNA from specific recognition sites to unspecific cleavage
sites (17, 48), it was proposed that the Dar proteins act by
sterically blocking these tracking movements. However,
more recent results indicated that DarB contains a domain which
is highly similar to a DNA-methyltransferase (7, 124), and
thus it cannot be excluded that rapid methylation during
the injection process, or immediately following it, might be
at least partially responsible for the observed protection.

Figure 24-3 Replication fork in a P1 lysogen. The asymmetric
EcoP1 recognition sequences are represented by
differentially shaded arrows, with the arrowheads pointing
in the direction of the cleavage sites. In the parental DNA
molecule all EcoP1 sites are methylated (indicated by
circles). In the two daughter molecules only those sites that
inherited a methyl group are modified. However, as in each
daughter molecule all unmodified sites have the same
orientation, no convergently tracking enzyme complexes
will collide and thus the daughter molecules are not cleaved.
EcoP1 enzyme complexes trailing the replication forks will
then de novo modify the unmethylated sites, ensuring
continued protection for the next round of replication (133).
Modified from Meisel et al. (132).


Lytic or Lysogenic Growth 

As a temperate bacteriophage, P1 has two life strategies
open to it. It can either stably lysogenize the host and then
replicate as a unit-copy number plasmid (102) or, alternatively,
pursue a lytic growth cycle, resulting in the release of
progeny phage particles (134). At the molecular level the
decision between the two alternative pathways is regulated
by the components of the complex tripartite immunity
system, which has been thoroughly investigated by many
members of the former research group of the late H.
Schuster (80). Figure 24-4 provides a schematic illustration
of the P1 immunity circuit with the three immunity regions
immC, immI, and immT. Immediately after injection of the
P1 genome, seven P1 immunity functions are expressed.
Their interplay, depending on the physiological conditions
of the host, determines the outcome of the infection. Under
optimal conditions with cells growing in rich medium at a
low temperature of 20°C, up to 30% of E. coli cells are
lysogenized (150).

immC

The basic molecular switch is located in the immC region of
the P1 genome, containing two genes encoding the major
repressor protein, C1 (44, 47), and the C1 repressor inactivator
protein, Coi (83). The C1 repressor is a 32.5 kDa protein
which binds to 22 operator sequences scattered widely over
the P1 chromosome (81, 124). C1 recognizes asymmetric 17
bp sequences with the consensus ATTGCTCTAATAAATTT
(81). The C1 repressor binds as a monomer to a single binding
site and two monomers bind non-cooperatively to double
binding sites (13, 82). Upon binding, C1 blocks the access of
RNA polymerase to promoters located in the vicinity of the
operator sites and thus represses the expression of genes
associated with these promoters (81).
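
Because the operator consensus is asymmetric, a search for candidate operators has to examine both strands. The sketch below scans a sequence for windows that differ from the 17 bp consensus at no more than a chosen number of positions. The mismatch threshold, the function names, and the toy sequence are hypothetical, and a mismatch count is only a crude stand-in for measured binding affinity.

```python
CONSENSUS = "ATTGCTCTAATAAATTT"   # 17 bp operator consensus quoted in the text
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def count_mismatches(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def scan_operators(seq, max_mismatches=2):
    """Return (position, strand, mismatches) for every window close to the consensus."""
    patterns = (("+", CONSENSUS), ("-", reverse_complement(CONSENSUS)))
    width = len(CONSENSUS)
    hits = []
    for i in range(len(seq) - width + 1):
        window = seq[i:i + width]
        for strand, pattern in patterns:
            distance = count_mismatches(window, pattern)
            if distance <= max_mismatches:
                hits.append((i, strand, distance))
    return hits


if __name__ == "__main__":
    toy = "GGGG" + CONSENSUS + "CCCC"
    print(scan_operators(toy, max_mismatches=0))
    print(scan_operators(reverse_complement(toy), max_mismatches=0))
```

Reporting the strand matters here: the orientation of an asymmetric site relative to a nearby promoter is part of what such a scan would be used to inspect.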

The Coi protein is a 7.7 kDa antirepressor protein which
forms a 1:1 complex with the C1 repressor, thereby inactivating
it (84). The C1 repressor has the potential to turn off the
expression of Coi, as one of the C1 binding sites overlaps the
promoter expressing the coi-c1 operon. The initial basic
switch thus relies on the relative synthesis rate of the Coi
and C1 proteins. An excess of Coi over C1 leads to lytic
growth, while the opposite ratio promotes lysogeny.


Figure 24-4 Organization of the tripartite P1 immunity system (regions shown: immC, immT, immI). The figure is not drawn to scale. Various promoter
sequences, which are recognized by the E. coli RNA polymerase associated with σ70, are indicated by triangles. Binding
sites for the C1 repressor protein are shown by black bars.

immI

Additional functions then further complicate the immunity
switch. The antirepressor Ant, like Coi, can bind and inactivate
C1, thus promoting lytic growth. Ant is a heterodimeric
complex of two proteins, Ant1 (38.7 kDa) and Ant2 (29.0
kDa), which are encoded in the immI region. Both subunits
of Ant are expressed from a single gene, with the translation
of the second protein starting at an in-frame start codon
(147).

In order to establish lysogeny the expression of Ant has to
be repressed, and this goal is only partially achieved by the C1
repressor (see figure 24-4). Though there is a C1-controlled
promoter reading toward the operon containing the genes
c4, icd and ant1/2, there also exists a constitutive promoter
expressing the same operon. Complete repression of Ant is
achieved by the C4 RNA. This is one of the first antisense
mechanisms ever described (25). The message of the c4
gene is processed by RNase P (71). The result is a 77-nucleotide
antisense RNA, which folds into a cloverleaf structure (26). This
cloverleaf RNA then has the ability to bind to complementary
sequences downstream in the same RNA message it
has been processed from. This RNA-RNA interaction blocks
the access of the ribosome to the Shine-Dalgarno (162)
sequence of the icd gene. As the ant1 gene is translationally
coupled to icd, the same interaction also blocks Ant1 expression
(26). However, C4 does not only affect translation,
but also interferes with the transcription of the operon.
The DNA-dependent RNA polymerase transcribing the
operon pauses at a ρ-dependent terminator located within
the icd gene (18). If C4 is present and prevents simultaneous
translation of the growing message, then transcription is
terminated. In the absence of C4 the translating ribosome
alters the structure of the nascent mRNA chain and thereby
triggers the RNA polymerase to complete transcription of
the entire operon (18).

The icd gene not only acts as a mediator between c4 and 
ant, but also specifies a small 8.8 kDa toxic protein, which 
acts as a cell division inhibitor. Icd blocks division of the
host cell until either lysogeny is successfully established or 
the entire developmental program of a lytic growth cycle is 
completed (148). 

immT

The corepressor protein, Lxc (formerly called Bof; 181),
expressed from the immT locus (157, 185), enhances the
repression exerted by the C1 repressor protein and thus
promotes lysogenic growth. Lxc is a small, 7.8 kDa protein
which binds to DNA-bound C1 repressor and in such a
DNA-C1-Lxc ternary complex increases the affinity of
C1 for the operator sites (184). This activity of Lxc results
in two antagonistic effects. As a consequence of the
C1-autoregulatory control loop (figure 24-4 and discussed
below), Lxc lowers the expression of c1, thus lowering the
equilibrium concentration of the C1 protein within the cell.
Lower C1 concentrations would result in reduced repression
activity if this effect were not compensated for by
the increased affinity of C1 for the operators. Whether the
combined result of these two effects is a tighter repression
or a slight derepression varies between different
C1-regulated promoters, accounting for the very pleiotropic
phenotype of lxc mutants (181, 185).

Maintenance of Lysogeny 

In order to maintain lysogeny the intracellular concentration
of the C1 repressor has to be buffered against fluctuations
during growth. An efficient solution to this problem is
an autoregulatory control loop as outlined in figure 24-4.
One of the promoters expressing the c1 gene is controlled by
the weakest C1-binding site found in the entire P1 genome
(198). In the presence of Lxc, moderate or even low concentrations
of C1 are sufficient to occupy this poor binding site,
in addition to all other binding sites, thus switching off the
expression of c1. Once the C1 concentration drops below a
certain threshold level, this weak binding site will be the
first to be clear of C1, allowing de novo synthesis in order to
replenish the C1 pool before any other C1-controlled promoters
get derepressed. The advantages of such a regulation are
twofold. First, any unnecessary accumulation of C1 in the cell
is avoided, reducing the metabolic burden of a P1 lysogen to
its host. Second, the P1 prophage retains the ability to react
quickly to stimuli that trigger lytic growth, as only small
amounts of antirepressor protein would be sufficient to inactivate
the low amounts of C1 present during equilibrium
conditions.

Induction of Lytic Growth 

The physiological conditions of the host cell are continuously 
monitored by the P1 prophage. Should changing 
circumstances disfavor the maintenance of lysogeny, a lytic 
growth cycle is initiated. Of the two antirepressor proteins 
Coi and Ant, only Ant can be expressed even in the presence 
of high concentrations of C1 and is thus able to overcome the 
C1 autoregulatory control loop. Conditions that change 
the overall RNase activity in the cell could impair the 
processing of the C4 RNA by RNase P (71), or could reduce the 
stability of the C4 RNA, thus leading to the expression of Ant. 
The inactivation of C1 by Ant then leads to the expression of all 
C1-regulated genes, including coi, triggering the switch 
from lysogenic to lytic growth. Such a model might reconcile 
the conflicting reports about the induction of P1 lysogens by 
UV light or other DNA-damaging agents (198). As UV damage 
does not directly affect the intracellular RNase activity 
levels, it is most likely that such a treatment induces P1 only 
indirectly, possibly depending on variations in the bacterial 
host, as already suggested by Yarmolinsky and 
Sternberg (198). 
Plasmid Maintenance 

The P1 prophage does not integrate into the bacterial 
chromosome (102), but is maintained as a unit-copy plasmid 
within the cell. Mechanisms to increase plasmid stability, 
such as copy-number control, active partition, dimer 
resolution, or plasmid addiction, are important for any 
low-copy-number plasmid, but P1 has the luxury of specifying all 
of them. Accordingly, the loss frequency of a P1 prophage is as 
low as 10^-5 cured cells per generation (198). 

Replication Control 

In terms of understanding replication and its control, the P1 
plasmid is one of the best studied members of the family 
of iteron-containing plasmids (24). Figure 24-5 shows the 
organization of the basic R replicon. The repA gene, encoding 
the 32.2 kDa phage-specific initiator protein, RepA, is 
flanked by the origin of plasmid replication, oriR, including 
the incompatibility control locus, incC, and a second 
incompatibility locus, incA. The hallmark of the replicon is the 
presence of 14 short repeated sequences, called iterons, 
forming both incompatibility loci. The iterons represent the 
binding sites of the RepA protein and show a 19 bp consensus 
sequence of GATGTGTGCTGGAGGGAAA (159). Binding 
of RepA to the iterons in incC is required for initiation, 
whereas binding to the incA iterons serves to control the 
plasmid copy number. In the absence of incA, the copy 
number increases about 10-fold, but replication is still 
controlled due to incC (143). Thus incC plays a dual role in 
both allowing replication and preventing overreplication. 
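
Because the iterons are defined by a consensus sequence and carry 
a directionality, candidate sites in a DNA sequence can be located 
by a simple scan of both strands that tolerates a few mismatches. 
The following is only a sketch of that idea: the function names, 
the mismatch threshold, and the test sequence are hypothetical, 
and only the 19 bp consensus itself comes from the text. 

```python
# Scan both strands of a DNA sequence for near-matches to the 19 bp
# iteron consensus. Threshold and example sequence are hypothetical.

ITERON = "GATGTGTGCTGGAGGGAAA"  # consensus quoted in the text (159)


def revcomp(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_iterons(seq, max_mismatches=2):
    """Return sorted (position, strand, mismatches) tuples for candidate iterons."""
    hits = []
    n = len(ITERON)
    for strand, motif in (("+", ITERON), ("-", revcomp(ITERON))):
        for i in range(len(seq) - n + 1):
            mismatches = sum(a != b for a, b in zip(seq[i:i + n], motif))
            if mismatches <= max_mismatches:
                hits.append((i, strand, mismatches))
    return sorted(hits)


if __name__ == "__main__":
    # Synthetic test sequence: one iteron in each orientation.
    demo = "AAAA" + ITERON + "CCCC" + revcomp(ITERON) + "TT"
    for pos, strand, mm in find_iterons(demo):
        print(pos, strand, mm)
```

Reporting the strand preserves the directionality of each site, 
which matters because the iterons of the two loci are not all 
oriented in the same way (figure 24-5). 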

Several host factors contribute to the initiation at oriR. 
Four GATC dam methylation sites within the origin itself 
need to be methylated (1, 2, 21). The methylation state of the 
origin was proposed to be negatively regulated by the host 
function, SeqA (22, 125), which binds to hemimethylated 
dam sites and sequesters them from the methylase (23). 
The melting of the DNA double strand at oriR occurs via the 
concerted action of RepA and the host factors, DnaA and HU 
(136, 190), thereby providing access for the host replication 
machinery (136, 142). 

Figure 24-5 Organization of the P1 replication region. 
The figure is not drawn to scale. The 19 bp iteron sequences 
are indicated by small boxes. The arrows within the 
boxes indicate their directionality. An elongated triangle 
represents the promoter sequence expressing the repA gene. 
[Figure labels: oriR, incC, incA; DnaJ, DnaK, and GrpE; 
Replication; Handcuffing.] 

The repA promoter is located within incC (figure 24-5) 
and is repressed upon RepA binding to the iterons (138). 
The expression of RepA is thus autoregulated. Freshly 
synthesized RepA protein apparently aggregates into 
dimers, but as only monomers show DNA-binding activity, 
there is a need for chaperones such as DnaK, DnaJ, and 
GrpE to convert the dimers into active monomers (35, 167, 
168, 191). As RepA shows very little cooperativity while 
binding to incC (42), it is likely that the iterons located in incA 
will be at least partially occupied by RepA monomers before 
productive initiation occurs (figure 24-5). Thus incA could 
reduce copy number by simply titrating RepA. However, as 
an oversupply of RepA from artificial sources poorly 
overcomes the incA inhibition (24), a mechanism other than 
titration has to operate. 

Such a mechanism is believed to be the pairing of origins 
(handcuffing) via bound RepA as illustrated in figure 24-5 
(24, 137, 143). Handcuffed plasmid molecules are thought 
to be refractory to reinitiation due to steric hindrance. 
How handcuffing is reversed, other than by the separation 
of plasmid copies during partition, is not known, though 
an increase in RepA concentration has been shown to 
overcome a limited decrease in copy number due to excess 
iterons (138). In summary, the copy number of P1 appears 
to be controlled by the availability of active RepA initiator 
protein, which promotes replication, but a continued 
increase in copy number is prevented by handcuffing (143). 

Plasmid Partition 

Replication control without active plasmid partition 
would contribute little to the stability of a low-copy-number 
plasmid. In P1 the replication and partition loci are 
physically linked (197). This circumstance, and the possibility 
that replication might be used to drive the partitioning of 
sibling plasmids, have suggested that the two processes are 
inseparable. However, this appears not to be the case, as 
evidence for active partitioning of unreplicated plasmids 
has been obtained (183). The P1 partition module is 
composed of two genes, parA and parB, and the centromere-like 
site parS (3). Our general understanding of the P1 partition 
process increased dramatically in the last decade due to 
major efforts in the laboratories led by Stuart J. Austin, 
Barbara E. Funnell, and Michael B. Yarmolinsky. 

A minimal sequence of only 22 bp has been shown to 
have some partition activity (130), but a more complete parS 
sequence is approximately 94 bp in length and contains an 
asymmetrically located binding site for the integration host 
factor, IHF (182), which is flanked by highly conserved 
hexamer and heptamer boxes (38, 39, 52, 53). These boxes, 
with the consensus sequences TCGCCA and ATTTCA(C/A), 
respectively, are the binding sites for the 37.4 kDa ParB 
protein (54, 123, 178). In a series of elegant experiments, 
some exploiting the subtle but distinct differences between 
the two partition systems of the closely related 
bacteriophages P1 and P7 (198), not only the species specificity but 
also the topology of the partition complex were worked out 
(53, 54, 74-77). As outlined in figure 24-6, ParB, in 
association with IHF, forms a high-affinity protein-DNA complex at 
parS, with the parS DNA wrapped around a core formed by 
the two proteins (20, 55, 179). Recent evidence demonstrated 
that ParB is able to pair partition sites (46), consistent with 
longstanding ideas about the partition mechanism (141). 

The second partition protein, ParA (44.3 kDa), is an 
ATPase (36, 40, 41). In a complex with ADP, ParA exerts 
negative autoregulatory control over the parAB operon (36, 51, 77). 
Bound to ATP, ParA joins the partition complex at parS, via 
protein-protein interactions with ParB, and plays a direct 
and essential role in the partitioning process (19, 41). Both 
repressor and ATPase activities of ParA are stimulated in 
the presence of ParB (37). 

The complex at parS also serves as a nucleation site for 
the polymerization of additional ParB molecules along the 
DNA flanking it on either side (149). Genes located in the 
vicinity of parS are transcriptionally silenced when they 
are covered by ParB (149, 195). However, how the silencing 
activity or any of the other interactions of the partition 
proteins at parS contribute to the actual equipartition of 
plasmid copies remains elusive. This holds true in spite of 
the latest localization studies, which revealed that P1 
replication occurs at mid-cell and that plasmid copies are then 
rapidly transported to the quarter and three-quarter 
positions (61, 62). These positions match the intracellular 
localization of ParB (50) and, among other results, 
indicate that there are clear-cut differences between the 
partition systems of P1 and its host E. coli (56, 61). 

Figure 24-6 Partition complex at parS. The integration host 
factor, IHF (182), binds centrally within parS and introduces 
a strong bend. ParB is able to coordinately contact both 
conserved hexamer (gray boxes) and heptamer (black boxes) 
sequences, located in both arms of parS. ParA, complexed 
with ATP, then joins this core complex via interactions 
with ParB. For further details see text. Modified from 
Bouet et al. (20). 

Dimer Resolution 

Homologous recombination between sister plasmids can 
generate plasmid dimers, which would interfere with the 
partition machinery. Such dimers carry directly repeated 
lox sites and are efficiently resolved by the Cre recombinase 
(4). In vitro data implied that there might be an equilibrium 
between dimer formation and resolution mediated by Cre. 
However, in vivo results showed an almost exclusive 
preference for the dimer resolution reaction (5). In analogy to the 
cer-Xer dimer resolution system of ColE1 plasmids (177), the 
involvement of host factors, which favor the formation of a 
resolution complex (63, 88), might account for the observed 
differences between in vivo and in vitro results. 

Plasmid Addiction 

While the above three mechanisms attempt to avoid the loss 
of the P1 prophage, plasmid addiction is a way to retaliate 
should these measures fail. The P1 addiction module (118) 
contains two small genes, phd and doc, expressed from an 
autoregulated promoter (58, 127, 128). The 13.6 kDa Doc 
protein is a potent cell toxin, inhibiting the bacterial 
ribosome (78). In the presence of the P1 prophage the toxicity 
of Doc is neutralized by the 8.1 kDa antidote protein, Phd. 
Through differential expression of the phd/doc operon an 
excess of Phd over Doc is produced (118), which favors the 
formation of an inert heterotrimeric Phd(2)Doc complex 
(59). The antidote protein Phd is labile, as it is the substrate 
of the cellular ClpXP protease (121). Upon loss of the P1 
prophage, ClpXP will degrade the residual antidote protein, 
leaving the stable toxin to kill plasmid-free cells (196). 
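
The timing implied by this scheme can be estimated with a 
back-of-the-envelope calculation. The sketch below is an 
illustration only and rests on several simplifying assumptions 
that are not from the text: Phd decays by first-order kinetics 
with a fixed half-life, Doc is completely stable, no new 
synthesis occurs after plasmid loss, dilution by cell growth is 
ignored, and the molecule numbers and half-life are hypothetical. 
Only the 2:1 Phd:Doc stoichiometry of the inert complex is taken 
from the text. 

```python
import math

# Estimate how long after plasmid loss the antidote suffices to
# neutralize the toxin. All numerical inputs are hypothetical.


def time_to_free_doc(phd0, doc0, phd_half_life_min):
    """Minutes until Phd falls below two molecules per Doc molecule.

    phd0, doc0: molecule numbers at the moment the plasmid is lost.
    phd_half_life_min: assumed first-order half-life of Phd in minutes.
    """
    if phd0 <= 2 * doc0:
        return 0.0  # already too little antidote for the Phd(2)Doc complex
    # Phd(t) = phd0 * 2**(-t / half_life); solve Phd(t) = 2 * doc0 for t.
    return phd_half_life_min * math.log2(phd0 / (2 * doc0))


if __name__ == "__main__":
    # Hypothetical example: 8-fold excess of Phd, 30 min half-life.
    print(time_to_free_doc(phd0=800, doc0=100, phd_half_life_min=30))
```

Under these assumptions the delay grows only logarithmically 
with the initial excess of antidote, so even a large surplus of 
Phd postpones the release of free Doc by no more than a few 
half-lives. 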

Lytic Growth 

Lytic Replication 

The inactivation of the C1 repressor, either after infection 
or upon induction of a P1 lysogen, marks the onset of lytic 
growth. An entire set of genes, which is normally repressed 
by C1, is now expressed. Several of these so-called early 
functions play a role during lytic DNA replication and 
associated processes. Initiation of lytic replication occurs at 
oriL, mediated by the RepL initiator protein (30, 70, 170). 
Replication is bidirectional (29), initially of the theta type, 
but then predominantly of the sigma type during later 
stages of the infection cycle (28). The repL gene is 
coexpressed with kilA, coding for a cell division inhibitor similar 
to icd (148). Thus, bacterial cell division is blocked 
throughout lytic growth. The expression of the kilA/repL operon is 
controlled not only by C1 but also by an antisense regulatory 
mechanism to avoid inadvertent expression during 
lysogenic growth (79). The progress of the lytic replication fork 
is independent of the host DnaB protein (73), as P1 specifies 
a DnaB analog called Ban (85, 180). Other associated 
proteins encoded by P1 include a Dam methylase (27, 33); 
a function, Ref (111, 126, 193, 194), that stimulates 
homologous recombination; a homolog of the DNA polymerase 
subunit, θ (124); HumD, a functional homolog of the E. coli 
UmuD' protein involved in SOS mutagenesis (131); and a 
single-stranded DNA-binding protein (12, 113). 

Activation of Late Transcription 

During lytic growth only two major regulatory steps, early 
and late transcription, have been described for 
bacteriophage P1 (69, 120). Late genes, which encode the building 
blocks for the phage particle, as well as proteins involved 
in particle maturation, lysis control, and lysis (65, 96, 117, 
158), are not expressed from standard E. coli promoters. 
Rather they are controlled by phage-specific late promoter 
sequences (65, 115), which share the -10 TATAAT consensus 
with E. coli promoters (72) but lack homology to the -35 
TTGACA box (65). Conserved among the P1 late promoters is 
a sequence called the -22 box, with the consensus 
ACAAGTTACTT, located 4 bp upstream of the -10 box (115). Such 
promoter sequences are not recognized per se by the E. coli 
σ70 RNA polymerase, but need to be activated during lytic 
growth. A single phage-specific protein, Lpa (formerly 
called gp10), was identified as essential for this 
activation (114). The expression of Lpa is controlled by C1 (120), 
defining it as an early function. Recent footprinting 
experiments showed that Lpa directly binds the -22 box in order 
to activate late transcription (69). Further experiments 
showed also that the RNA polymerase-associated host 
factor, SspA, is required for P1 late transcription (69, 103, 
192). Exactly how Lpa and SspA cooperate in redirecting 
the host RNA polymerase towards P1 late promoter 
sequences is not clear. 

Particle Morphogenesis 

While most morphogenetic functions have been identified 
genetically, very little is known about the general assembly 
process of the P1 particle. Two internal head proteins, DarA 
and DarB, play a role in head morphogenesis (101), in 
addition to their involvement in defense against type I 
restriction described above. A protease, which is required 
for the proper maturation of phage heads (175), and a particle 
maturation function, mat (formerly called gene 1), whose 
mutants show a general defect in the production of infective 
particles (188), have been described (187). Unexpectedly, it was 
found that the mat gene is expressed both early and late 
during lytic growth (117), but the exact role that the Mat 
protein plays during particle maturation remains unclear. 

Packaging 

The initial packaging reaction starts at a specific cleavage 
site, pac (171). The pacase enzyme, formed by a heterodimer 
of the two proteins PacA (45.2 kDa) and PacB (55.6 kDa) (164, 
166), binds and cleaves fully methylated pac sites (172) in 
association with the host factors, IHF and HU (165). The 
pacase then remains associated with one end of the cleaved 
pac site, interacts with a prohead, and mediates the 
incorporation of the phage DNA into the head. Subsequent pac 
sites present in the concatemeric DNA substrates, products 
of the sigma-type replication forks during the later stages of 
a lytic growth cycle (28), are not cleaved by the pacase. Such 
a processive "headful" packaging mechanism results in P1 
heads being filled with DNA molecules larger than genome 
size, with approximately 10% terminal redundancy (10, 198). 
The signal to ignore subsequent pac sites might be related to 
their methylation state (172) or, alternatively, it could be 
that the pacase is inhibited while it is associated with a 
prohead in the process of being filled. Evidence for the latter 
idea stems from the result that multiple rounds of pac 
cleavage are initiated when the packaged DNA molecule(s) 
does not have the required size to completely fill the phage 
head (31). 
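
The consequence of processive headful packaging for the ends of 
successive packaged molecules can be made explicit with a small 
calculation. This is only a sketch: the genome length used is a 
round hypothetical number rather than the actual size of the P1 
genome, the pac site is arbitrarily placed at coordinate 0, and 
every headful is assumed to have exactly the same length. 

```python
# Ends of successive headfuls cut from a concatemer, expressed in
# genome coordinates. Genome length and pac position are hypothetical.


def headful_ends(genome_len, redundancy=0.10, n_heads=4):
    """Return (start, end) genome coordinates of successive headfuls.

    The first headful starts at the pac site (coordinate 0); each later
    headful starts where the previous one ended, without a new pac cut.
    """
    headful = round(genome_len * (1 + redundancy))
    ends = []
    start = 0
    for _ in range(n_heads):
        end = start + headful
        ends.append((start % genome_len, end % genome_len))
        start = end
    return ends


if __name__ == "__main__":
    for i, (s, e) in enumerate(headful_ends(100_000), start=1):
        print(f"headful {i}: starts at {s}, ends at {e}")
```

With 10% terminal redundancy each successive headful starts one 
tenth of a genome length further along the map, so only the first 
molecule of a series carries a pac-generated end and the later 
ones are increasingly permuted. 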

Transduction 

The mechanism of formation of generalized transducing 
particles did not attract much attention, even though phage 
P1's ability to randomly transduce chromosomal markers is 
one of its most prominent features (198). P1 mutants with 
increased transduction frequencies have been isolated (96, 
101, 189). Some contained mutations that were mapped to 
the dar operon and were thus linked to the head assembly 
process (96). Theoretically, mutations affecting either the 
availability of chromosomal-DNA free ends, the processivity 
of the packaging reaction, or the stability of transduced 
DNA molecules in the recipient cell all could result in an 
increased transduction frequency (198). Whether the 
respective protein(s) affects processivity via crosstalk between the 
head and the packaging machinery (165), or is associated 
with free ends of packaged DNA and thus stabilizes 
transduced DNA molecules, remains to be determined. 
Cell Lysis 

At the end of a lytic growth cycle newly formed progeny 
particles start to accumulate in the cytoplasm of the host. 
To release these progeny from their confinement, P1 
specifies a 20.3 kDa, classical T4-type lysozyme (145, 200) with 
mureolytic activity (158). To secure the correct timing of 
cell lysis, the access of the lysozyme to the peptidoglycan 
layer is controlled by a holin, LydA (11.4 kDa), and a 
holin inhibitor, LydB (17.1 kDa) (96, 101, 158; chapter 10). The 
lysozyme cleaves the peptidoglycan strands until the cell 
wall is no longer able to withstand the osmotic pressure 
within the cell. Lysis of the host cell terminates the lytic 
growth cycle and a burst of around 100-200 infective 
phage particles (134), like the one shown in figure 24-1, is 
released. 


Outlook 

The writing of this chapter marks the fiftieth anniversary of 
bacteriophage P1. Having reached such a venerable age, P1 
research shows a few signs of slowing down, hand in hand 
with a general decrease in the interest in basic prokaryotic 
research. However, P1 genes continue to be prominent in 
the study of at least one major biological problem: the 
mechanism of partitioning (46). Also, many other aspects of 
P1 biology still pose unsolved puzzles, worthy of scientific 
scrutiny, and the recently completed nucleotide sequence of 
the P1 genome (124) might add a few more challenging 
problems to the list. On several occasions P1-derived systems 
also managed to transcend the limitations of prokaryotic 
research. A P1-based in vitro packaging system (169) is now 
widely used to clone large DNA fragments from almost any 
source (32, 161), and the lox-cre recombination system has 
turned out to be invaluable for manipulating the genomes 
of higher organisms (16, 156). Along these lines, P1 might 
well prove useful in yet other, maybe even medically relevant 
applications. A rational approach to phage therapy (chapter 
48), for example, could be based on the ability of 
bacteriophage P1 to infect and destroy a broad range of pathogenic 
bacteria. 
Origin of Phage P2 and P2-Like Phages 

Bacteriophage P2 was isolated by G. Bertani, together with 
two other prophages, from the Lisbonne and Carrere strain 
of Escherichia coli (the oldest known lysogen) (8). The three 
phages were named P1, P2, and P3, based on their different 
plaque morphology and their partial serological 
crossreactivity. Phage P1 was later shown by E. Lennox to be a general 
transducing phage and to belong to a group of phages that 
differs from P2 in many respects (82) (see chapter 24). 
Phage P3 has not been studied as extensively as the other 
two. Since that time, a number of temperate phages that 
grow on E. coli have been isolated that share some, but not 
all, characteristics with phage P2. These P2-like phages are 
similar in traits such as morphology and host range, are 
serologically unrelated to phage λ, and are not inducible by UV 
light. Of these P2-like coliphages, P2 and 186 are the best 
characterized, and they have in addition the capacity to 
function as helpers for phage/plasmid P4. For a list of other 
known P2-like E. coli phages, see reviews by Bertani and 
Bertani (11) and Bertani and Six (12), and for phage/plasmid 
P4 see chapter 26. 

P2-Like Phages in E. coli and Other Bacteria 

P2-like prophages seem to be quite common in E. coli. At 
least 26% of the strains in the E. coli reference collection 
(ECOR) contain a P2-like prophage: hybridization of a 
32P-labeled P2 DNA probe against chromosomal DNA 
gave a strong signal with 19 strains and a weak signal with 
two more strains of the collection (109). The 72 strains in the 
ECOR collection were selected from a set of 2600 
isolates from a variety of hosts and from different parts of 
the world (111). Thus, the ECOR collection is expected to 
contain a large part of the genetic variation in E. coli. 
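
The arithmetic behind the "at least 26%" figure follows directly 
from the strain counts quoted above and can be checked in a few 
lines; the helper name percent is introduced here purely for 
illustration. 

```python
# Fraction of ECOR strains giving a P2 hybridization signal,
# using the counts quoted in the text (19 strong, 2 weak, 72 total).


def percent(part, whole):
    """Return part/whole expressed as a percentage."""
    return 100.0 * part / whole


strong, weak, total = 19, 2, 72

if __name__ == "__main__":
    print(f"strong signals only: {percent(strong, total):.1f}%")
    print(f"strong plus weak:    {percent(strong + weak, total):.1f}%")
```

Counting only the strong signals gives the lower bound of about 
26%; including the two weak signals would raise the estimate to 
about 29%. 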

P2-like phages also seem to be distributed among other 
proteobacteria of the gamma subgroup. The genomes of 
phages HP1 and HP2 in Haemophilus influenzae (35, 151), 
phage φCTX in Pseudomonas aeruginosa (103), and phage 
K139 in Vibrio cholerae (105) have all been sequenced and 
shown to be P2-like with respect to genome organization 
as well as nucleotide sequence. Phages PSP3 in Salmonella 
potsdam (20) and SopEΦ in Salmonella typhimurium (100) 
are also P2-like, but they are not yet fully sequenced. 
Bacterial genome sequencing projects have revealed additional 
P2-like prophages, for example Sp13 in the 
enterohemorrhagic E. coli O157:H7 isolated during the Sakai outbreak in 
Japan (55), and Fels-2 from Salmonella typhimurium LT2 (99). 

Scope of this Chapter 

Since the last comprehensive review on phage P2 and 
P2-like phages, written in 1988 (12), new information 
concerning many aspects of P2's gene regulation, DNA replication, 
head morphogenesis, and site-specific recombination has 
accumulated. The current understanding of these topics 
will be summarized here. Since several P2-like phages have 
been completely sequenced recently, it has become possible 
to compare whole genomes and shed some light on the 
evolution of P2-like phages. But this has also revealed large 
differences, which make classification of phages as 
belonging to a specific "family" or "group" problematic. In some 
cases, phages have similar genes encoding capsid proteins, 
while their DNA replication machinery seems to differ, for 
example phage P2 compared with phage φCTX. In other 
cases, phages share a replication mechanism with phage P2, 
like the virulent phage φX174 and phage PM2, but nothing 
more. It is too early to decide which phage characters should 
be used to define a specific family since too few P2-like 
phages have been fully sequenced. Therefore, functional gene 
groups are compared rather than whole phages, and we use 
the term "P2-like" for phages that share some, but not all, 
properties with P2. 

The genetic interactions between phage P2 and the 
defective satellite phage, or plasmid, P4 are discussed in detail 
in chapter 26, and only certain key features relevant for 
the P2 helper are discussed in this chapter. 
Phage Structure and Assembly 

A P2-like phage has an icosahedral head with a diameter of 
about 60 nm, containing a linear double-stranded DNA 
molecule of about 30-35 kb with cohesive ends 19 nucleo¬ 
tides long, and a 135 nm long straight tail with a contractile 
sheath. The tail ends with a baseplate carrying six tail fibers 
and a spike. Based on their morphology, P2-like phages 
are taxonomically classified, together with phage P1 and 
the T-even phages, as members of the Myoviridae family in 
the order Caudovirales (1). 

The Capsid 

The major head protein of phage P2 is encoded by gene 
N, and the mature capsid contains mainly the processed 
form of the N protein, N*, which lacks 31 amino acids at the 
N-terminus (126). The structure of the P2 head has been 
analyzed by image reconstructions from cryo-electron 
micrographs with a resolution of 4.5 nm (32). The N protein 
seems to have two domains, one domain comprising the 
capsomer and the other forming trimeric connections 
between the capsomers. The capsomers are assembled in 
a T = 7 icosahedral symmetry with 12 pentamers and 60 
hexamers. The pentamers protrude more from the capsid 
surface than the hexamers. 
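
The capsomer counts quoted for a T = 7 shell follow from the 
general geometry of icosahedral capsids: a shell of triangulation 
number T = h^2 + hk + k^2 contains 60T subunits arranged as 12 
pentamers and 10(T - 1) hexamers. The short sketch below applies 
this standard rule; the function name and the (h, k) index pairs 
are chosen here for illustration, and T = 4 is included because 
it is the value reported for the small capsids formed in the 
presence of P4. 

```python
# Capsomer and subunit counts of an icosahedral shell from its
# triangulation number T = h*h + h*k + k*k (quasi-equivalence rule).


def capsid_counts(h, k):
    """Return T and the pentamer, hexamer, and subunit counts for indices (h, k)."""
    t = h * h + h * k + k * k
    return {
        "T": t,
        "pentamers": 12,             # always 12, one per icosahedral vertex
        "hexamers": 10 * (t - 1),
        "subunits": 60 * t,
    }


if __name__ == "__main__":
    print(capsid_counts(2, 1))  # T = 7, the large P2 capsid
    print(capsid_counts(2, 0))  # T = 4, the small P4-sized capsid
```

For T = 7 the rule gives the 12 pentamers and 60 hexamers stated 
above (420 copies of the major capsid protein), whereas a T = 4 
shell has 12 pentamers and 30 hexamers (240 copies). 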

P2 head assembly has been extensively studied, since the 
capsid protein N can assemble into two different sizes, 60 or 
45 nm, depending on the scaffolding protein used (2, 97, 98). 
In a normal P2 infection, only large capsids are formed, 
but during a mixed infection with satellite phage P4, small 
capsids with a T = 4 symmetry are also formed. The small 
capsids can only package the small 11.52 kb P4 genome, 
thereby excluding the large 33.6 kb P2 genome. 
Cryo-electron microscopy studies of procapsid-like structures 
isolated from cells expressing the N protein have shown 
that closed shells of both sizes are formed, although with 
a low efficiency. Coproduction of P2 N protein and the P2 
scaffolding protein, O, leads to a more efficient assembly 
of large capsid structures, but when the phage P4 
scaffolding protein, Sid, is also expressed, the assembly of small 
capsid structures prevails (98). The phage P2 O protein is 
presumed to form an internal scaffold, while the P4 Sid 
protein forms an external scaffold (31, 96). The P2 O and the 
P4 Sid proteins are both required for formation of infectious 
P4 particles. The target for the phage P4 scaffolding protein, 
Sid, has been identified by the isolation of phage P2 mutants 
(P2 sir mutants, for sid responsiveness) that do not respond 
to the action of the P4 Sid protein. The sir mutations have 
been mapped to a 38-codon-long segment in the middle 
of gene N (137). 

The procapsid contains the full-length N protein and 
the scaffold protein, O, which implies that processing of 
the N and O proteins occurs after assembly. Thus, the 
mature capsid contains the processed form of N protein, 
called N*, as well as the 17 kDa processed form of the O 
protein (97, 126). 

The products of genes Q, P, and M are required for 
packaging the DNA into the head and for conversion of the 
prohead to a capsid. The Q protein constitutes the portal 
protein (see below), while the M and P proteins constitute 
the terminase activity. The packaging substrate consists 
of closed monomeric DNA circles that are cleaved at the cos 
site, generating the 19-base-long single-stranded cohesive 
ends. The M protein, which has a DNA-binding activity, has 
been suggested to contain the endonuclease activity, while 
the P protein contains a DNA-dependent ATPase activity 
that can account for the ATP requirement of the terminase 
activity (16, 17). 

An alignment of the cos regions of phages P2 and 
P4 identified a region of 55 bp, including the common 19 bp 
long cos sequence, with only three mismatches. A conserved 
inverted repeat, located at one side of the cos 
sequences, might be the recognition sequence for the packaging 
protein (161). 

The Connector 

The connector or portal that joins the phage head and 
tail has a double-disk structure. One disk is hidden by the 
head and is detectable only after disruption of the virions 
(127). The phage P2 gene Q encodes the connector protein, 
which is present in the virion in a processed form that 
lacks the 24 N-terminal amino acids (86, 127). Purified 
full-length Q protein assembles into connector-like structures. 
Image reconstructions, based on two-dimensional 
crystalline layers of purified full-length Q proteins, have shown a 
toroid structure with a central mass surrounding a channel 
with a diameter of about 2 nm. The central mass has 12 
protrusions that suggest a 12-fold symmetry (125). The P2 
connector thus seems to have a design similar to other 
phage connectors even though no similarities at the DNA 
or protein levels have been detected (86, 95, 145, 146). 

The Tail 

There are 12 genes known to be involved in tail production 
in the P2 genome, of which seven have been identified as 
part of the tail structure (81, 84). 

Proteins FI and FII make up 60% and 30% of the tail, 
respectively, and have been proposed to constitute the 
sheath and the tube, respectively (81, 142). 

Mutations in tail genes R and S will still result in tail 
structures. Infections with phage P2 R mutants result in 
giant naked tail tubes and extended tails, while the tails of 
the S mutants look normal but are functionally inactive 
(81). Due to these phenotypes, the S and R proteins have 
been suggested to be involved in tail completion (85). The 
T protein is present in the tail structure with 5 ± 2 copies 
per tail. Its function is not known, but due to its large size 
and predicted α-helical structure, it has been suggested to 
be the ruler protein (85). 

The tail is terminated by a baseplate, which carries six 
tail fibers and a small protruding spike. Electron microscopy 
of phage particles, using antibodies raised against proteins V 
or J, has shown that the V protein forms the spike and that 
the J protein is part of the baseplate or the proximal part of 
the tail fiber (48). The W protein has been inferred to be part 
of the baseplate since it is homologous to the gene 25 protein 
of phage T4. The H protein probably constitutes the distal 
part of the tail fiber since it contains regions similar to tail 
fiber proteins of other unrelated phages with the same host 
range (47). The assembly of the tail fiber is probably mediated 
by the G protein. 

Comparison with Other P2-Like Phages 

The structural genes of P2-like phages are arranged 
similarly on the genetic map and many of the encoded proteins 
are very similar in composition and size (116, 153) (table 25-1). 
Clearly the genes required for capsid formation are very 
well conserved in all six fully sequenced P2-like phages, 
while the tail genes fall into two groups based on sequence 
similarities and gene organization. One group contains 
phages P2, 186, and φCTX, and the other contains phages 
HP1, HP2, and K139. A possible explanation for the 
difference in tail gene organization is that all HP1 tail genes 
are transcribed from a single promoter, and genes 
encoding proteins required in large quantities must be located 
close to the promoter (35). In general, P2-like tail genes are 
found in both related and unrelated phages, as well as in 
bacteria where they can function as bacteriocins (35, 92, 
104, 115). 

The Lytic Cycle 

P2-like phages are temperate, that is, they can grow lytically 
as well as form lysogens. During lytic growth, gene 
expression is regulated over time. Early transcription is 
initiated immediately after infection, and requires only 
the host σ70 RNA polymerase, leading to expression of the 
genes required for DNA replication. Once DNA replication 
has been initiated and the transcriptional activator needed 
for activation of the late promoter has accumulated, late 
gene transcription is initiated. The capsid and tail are 
assembled by two independent pathways. After DNA packaging the 
tails are added to the filled capsids. As is true for all tailed 
phages, the phage P2 lytic cycle ends with phage-induced 
cell lysis (chapter 10). 

Early Transcription 

The early operons of phages P2, 186, HP1, and K139 contain 
9, 12, 11, and 11 genes, respectively (figure 25-1). The first 
gene of each operon, designated cox in P2, HP1, and K139, 
but apl in phage 186, encodes the repressor of the lysogenic 
promoter. These genes are thus functionally equivalent to λ 
cro, but unlike the λ Cro protein, P2 Cox and 186 Apl have 
been shown not to be essential for lytic growth even though 
at high concentrations they reduce early lytic 
transcription (119, 129). Another common gene in the early 
transcript is the A gene, designated rep in HP1 and K139, 
that is required for initiation of rolling circle replication (see 
DNA Replication below). It should be noted that phage φCTX 
does not contain a gene homologous to the P2 A gene and 
may thus initiate DNA replication by some other 
mechanism. For phage 186, and possibly phage HP1, the initiator is 
the only phage protein required for phage DNA replication 
(136). This is in contrast to phage P2 where gene B encodes 
an additional protein, a helicase loader (112). Phage K139 
may also encode a P2 B-like protein, since the product of 
ORF3 has 28% identity to P2 B (72). The early transcript 
of phage 186 contains a cII gene, which encodes a 
transcriptional activator of the phage 186 pE promoter required 
for establishment of lysogeny (77, 106). The phage 186 pE 
promoter, located between genes apl and cII, controls 
expression of gene cI. Phage K139 has a gene encoding a protein 
showing 27% identity to phage 186 cII protein and it is 
therefore believed to have a similar function (105). Most of 
the remaining genes have unknown functions, and some, 
namely phage P2 ORF78, ORF80, and ORF82 and the phage 186 
dhr and fil genes (92, 123, 129), are lethal to the host. The dhr 
and fil gene products have been shown to inhibit host DNA 
replication and cell division, respectively. Phage P2 ORF82 
and ORF83 are similar to phage 186 ORF80 and ORF83, 
but in phage 186 they are separated by ORF81. Proteins 
with 30-40% identity to that produced by the P2 ORF82 
can be found in plasmids (E. coli F plasmids), in phages 
(φCTX, 933W), and in bacteria (E. coli, H. influenzae, Yersinia 
pestis, and Treponema pallidum), and it has been suggested 
to be a dnaK suppressor (33). 

Early lytic transcription has been analyzed only in phage 
186. It is terminated at tR1, a Rho-independent terminator 
that shows 70% efficiency in vitro (124). The transcript is 
extended by some unknown antitermination mechanism 
that does not require any known phage 186 protein. The 
extended transcript is processed by RNase III, which cleaves 
at a site within the fil-dhr region (28). 

Many of the genes in the early operons of phage P2 
and phage 186 have start and stop codons that overlap, indicating 
translational coupling. This has also been shown in 
phage 186, where an amber mutation in ORF84, located 
proximal to the A gene, prevents expression of the A initiator 
protein, which lacks a ribosomal binding site (136). In the 
case of phage P2, ORF83 is located proximal to the A gene, 
and the two genes overlap by three amino acids. Since phage 
P2 gene A also seems to lack a properly spaced ribosome-binding 
site, the expression of A is very low (80; A. Ahlgren 
Berg, personal communication). 



Table 25-1 Late Genes of P2-Like Phages 



P2 


186 

ΦCTX 

HP1/HP2 

K139 

Similar genes 

Function 

Gene 

Codons 

Gene 

Codons 

Gene 

Codons 

Gene 

Codons 

Gene 

Codons 





ctx 

286 






Toxin 

Q 

344 

Orf 2 

340 

Q 

350 

Orf 15 

345/345 

Orf 15 

348 

Ec67 Orf 5 

Capsid portal protein 

- 


W 

248 









P 

590 

Orf 12 

589 

P 

594 

Orf 16 

607/607 

Orf 16 

605 

Ec67 Orf 6 

Large terminase subunit 

0 

284 

V 

284 

O 

273 

Orf 17 

298/297 

Orf 17 

299 

<5Hs Orfl 

Capsid scaffold 

N 

357 

T 

355 

N 

338 

Orf 18 

336/336 

Orf 18 

341 

<DHs Orf2 

Major capsid precursor 

M 

247 

R 

249 

M 

235 

Orf 19 

281/279 

Orf 19 

238 

<DHs Orf3 

Small terminase subunit 

L 

169 

Q 

168 

L 

153 

Orf 20 

150/150 

Orf 20 

153 

•this Orf4 

Capsid completion 





Orf 7 

65 







X 

67 

Orf 23 

67 

X 

69 





Pyocin R Orf 22 

Tail 











<DHs Orf 5 


Y 

93 

Orf 24 

98 








Lysis — holin 







Orf 21 

166/161 

Orf 21 

162 











Orf 23 

53 









Orf 23 

376/376 

Orf 24 

369 


Tail sheath 







Orf 24 

150/144 

Orf 25 

152 


Tail tube 









Orf 26 

69 


P2 Orf 82 









Orf 27 

75 









hoi 

78/78 



<DHs Orf 6 

Lysis — holin 

K 

165 

P 

165 



lys 

186/179 

Orf 28 

195 

<5Hs Orf 7, X R 

Lysis — endolysin 

lysA 

141 










Lysis — timing 





Orf 9 

117 











Orf 10 

90 











Orf 11 

268 







lysB 

141 

Orf27 

137 

lysB 

153 






Lysis — timing 

Orf 


Orf 28 

96 

Orf 12.5 

89 







R 

155 

N 

155 

R 

178 





<DHs Orf 10 

Tail completion 

S 

150 

0 

149 

S 

156 

Orf 22 

227/231 

Orf 22 

166 

•this Orf 11 

Tail completion 





Orf 15 

242 







Orf 30 

261 











V 

211 

Orf 32 

213 

V 

190 





Pyocin R Orf 11 

Tail spike 

W 

115 

M 

115 

W 

114 





Pyocin R Orf 12 

Baseplate 

1 

302 

L 

302 

1 

304 





Pyocin R Orf 13 

Baseplate/tail fiber 

1 

176 

Orf 38 

176 

1 

178 





Pyocin R Orf 14 

Tail 







Orf 25 

115/115 













Orf 29 

116 


Signal peptide 







Orf 26 

102/102 

Orf 30 

192 


Signal peptide 







Orf 28 

111/111 

Orf 32 

110 









Orf 29 

393/382 

Orf 33 

399 









Orf 30 

174/175 

Orf 34 

219 



H 

669 

K 

462 

H 

762 

Orf 31 

925/910 

Orf 35 

620 

Pyocin R Orf 15 

Tail fiber 

G 

175 

Orf 45 

166 



Orf 32 

200/210 



Mu gene U 

Tail fiber assembly 

Z/fun 

528 










Lysogenic conversion 









Orf 3 6 

174 


Tail fiber assembly 





Orf 21 

148 







FI 

396 

1 

392 

FI 

391 





Pyocin R Orf 17 

Tail sheath 

Fll 

172 

1 

173 

Fll 

171 





Pyocin R Orf 18 

Tail tube 

E+E 

142 

H 

162 

E 

108 







E 

91 

Orf 52 

58 

E' 

39 







T 

815 

G 

812 

T 

904 

Orf 27 

689/709 

Orf 31 

605 


Tail 

U 

159 

F 

161 

U 

146 





Pyocin R Orf21 

Tail 

D 

387 

D 

389 

D 

424 





Pyocin R Orf23 

Tail 





Orf 28 

295 











Orf 29 

170 











Orf 30 

116 











Orf 31 

38 











Orf 32 

60 











Orf 33 

156 













Orf 33 

258/255 

Orf 37 

297 









Orf 34 

187/186 

Orf 38 

157 









Orf 35 

533/533 

Orf 39 

543 



ogr 

72 

B 

72 

ogr 

97 






Late promoter activator 


Figure 25-1 Schematic drawing of the regions between attP and cos, or the equivalent region, of phage P2 and some P2-like 
phages. The arrows indicate direction of transcription, and the small circle the respective promoter. The genes, or open 
reading frames (ORFs), are indicated above or below the genes and coding parts are indicated by boxes. Homologous genes, 
or ORFs, are indicated by identical fillings of the respective box. 

DNA Replication 

DNA replication has been studied only in phages P2 and 186, 
which both replicate via a modified rolling-circle mechanism 
that generates double-stranded monomeric circles. 
DNA replication is initiated by the A protein, which catalyzes 
a single-stranded cut at the origin. Upon cleavage, the A 
protein gets covalently linked to the 5' end of the cleaved 
strand, while the 3'-OH end will function as a primer for 
DNA polymerization (112). After one round of replication, 
when the origin sequence is regenerated, the covalently 
linked A protein initiates a series of transesterification reactions 
which lead to cleavage of the newly synthesized origin 
sequence and, at the same time, joining of the displaced 
old strand (figure 25-2). In this way, the A protein becomes 
covalently linked to the 5' end of the new strand and another 
cycle is directly initiated. In phage P2, the cleavage and joining 
reactions are mediated by two tyrosine residues located 
at the catalytic site of the A protein, Tyr-450 and Tyr-454 
(90, 112). This DNA replication mechanism is thus very 
similar to the well-studied ΦX174 system, but in contrast 
to the ΦX174 system, the two tyrosine residues do not have 
equivalent roles during initiation of replication. The Tyr-454 
residue of the phage P2 A protein promotes the initial cleavage 
reaction, after which the two tyrosine residues act in a 
flip-flop mechanism as has been suggested for phage ΦX174 
(149). 

The origin has been shown to be located within the 
coding part of the phage P2 A gene (91), and similar sequences 
can be found within the coding part of initiator proteins 
of phages 186, HP1, HP2, and K139 (figure 25-2). The importance 
of the bases at the cleavage site in the phage P2 origin 

















P2         CGCCGCGCCTCG/GAGTCCTGTCAATAACTGTGGAA 
186        TGCGCCCTCTCG/GAGTTCTGTCAATAACTGTACGG 
K139       TGCCGCCTCTCG/GAGTTCTGTCAATAACTGTACGG 
HP1/HP2    CAGTGCGCCTTG/GACTTGTGTCAGTAACTGTAACC 
Consensus  gC gC CTcG/GAgTtcTGTCAaTAACTGTa 


Figure 25-2 Phage P2 DNA replication. A: Hypothetical model of the steps during P2 DNA replication. Initiation: The A 
protein is synthesized, cleaves the ori region in cis, and gets covalently linked to the 5' end of the cleaved strand. The A 
protein recruits the P2 B/DnaB helicase complex, REP, and all other host factors required for DNA replication to the origin. 
Elongation: The free 3'-OH end is used as primer for DNA polymerase III, and the displacement of the old strand, which 
contains the A protein at the 5' end, is mediated by the REP helicase. The DnaB protein is required for lagging-strand 
synthesis, possibly for recruitment of the DnaG primase. Termination: After one round of replication, when ori has been 
regenerated, the A protein cleaves the new ori and joins the old strand, which is released as a double-stranded covalently 
closed circle after lagging-strand synthesis is finished. Since the A protein remains covalently linked to the 5' end of the new strand, 
a new round of replication can be initiated. B: A comparison of the P2 origin of replication with the hypothesized origins of 
some P2-like phages. The cleavage site in the P2 origin is indicated by a slash; it has been determined experimentally, while 
the others are hypothesized. The nucleotides required for cleavage in P2 are in bold typeface. In the consensus sequence, 
nucleotides common to all five phages are capitalized. 


has been analyzed, and two nucleotides are critical for in 
vitro cleavage: G at position −1 and T at position −3 (90). 
These two nucleotides are also conserved in the presumed 
origins of phages 186, K139, and HP1. Since the origin is 
located within the coding part of the A gene, this conservation 
might simply reflect essential amino acids in the A 
protein. However, the conserved T residue is located in the 
third position of the codon, and codes for a Pro in the initiation 
proteins of phages P2, HP1, and HP2 but a Ser in phages 
186 and K139. Furthermore, the G residue is located in 
the second position and codes for an Arg in phages P2, 186, 
and K139 but Trp in phages HP1 and HP2 (figure 25-3). 
The sequence on the 5' side of the cleavage site is better 
conserved than that on the 3' side, and might constitute the 
recognition sequence for the initiator proteins. The recognition 
sequence of the ΦX174 A protein is located at a similar 
position, but there is no significant sequence identity of this 
region to equivalent regions in the P2-like phages (4). 
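
The argument above can be checked directly against the origin sequences printed in figure 25-2B. The following is a minimal sketch, not part of the original analysis: it computes the positions shared by all listed origins and translates the two codons that contain the critical T (−3) and G (−1). The reading frame is an assumption taken from the text (T in a third codon position, G in a second codon position), the codon table is deliberately partial, and the names ORIGINS, strict_consensus, and codons_at_cleavage_site are illustrative only.

```python
# Origin sequences as printed in figure 25-2B; "/" marks the cleavage site.
ORIGINS = {
    "P2":      "CGCCGCGCCTCG/GAGTCCTGTCAATAACTGTGGAA",
    "186":     "TGCGCCCTCTCG/GAGTTCTGTCAATAACTGTACGG",
    "K139":    "TGCCGCCTCTCG/GAGTTCTGTCAATAACTGTACGG",
    "HP1/HP2": "CAGTGCGCCTTG/GACTTGTGTCAGTAACTGTAACC",
}

# Partial codon table: only the four codons this example needs.
CODONS = {"CCT": "Pro", "TCT": "Ser", "CGG": "Arg", "TGG": "Trp"}


def strict_consensus(seqs):
    """Keep a base where every sequence agrees; write '.' everywhere else."""
    if len({len(s) for s in seqs}) != 1:
        raise ValueError("sequences must be pre-aligned to equal length")
    return "".join(col[0] if len(set(col)) == 1 else "." for col in zip(*seqs))


def codons_at_cleavage_site(origin):
    """Translate the codon ending at -3 and the codon with -1 in its middle.

    Assumed frame (from the text): the T at -3 is a third codon position,
    so that codon covers positions -5..-3; the G at -1 is a second codon
    position, so its codon covers -2, -1 and the first base after the cut.
    """
    left, right = origin.split("/")
    upstream_codon = left[-5:-2]
    spanning_codon = left[-2:] + right[0]
    return CODONS[upstream_codon], CODONS[spanning_codon]


consensus = strict_consensus(list(ORIGINS.values()))
print(consensus)
for phage, origin in ORIGINS.items():
    print(phage, *codons_at_cleavage_site(origin))
```

Under these assumptions the invariant positions include the T at −3 and the G at −1, while the encoded residues differ between phages (Pro or Ser, Arg or Trp), which is the basis for arguing that the nucleotide conservation is not simply a consequence of amino acid conservation. Unlike the printed consensus, this sketch does not mark majority bases in lowercase.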

The A/Rep proteins of phages P2, 186, HP1, HP2, and 
K139 belong to a large family of proteins that initiate 
rolling-circle replication. This mode of replication is found 
among small phages, such as ΦX174 and M13, in a number 
of small plasmids in bacteria as well as in archaebacteria, 
and in plant and animal viruses. Conjugative plasmids have 
also been found to use a similar initiation mechanism 
for mobilization of their DNA. A comparative analysis of 
the initiation proteins has classified them into two major 
groups, the replication (Rep) and the mobilization (Mob) 

Figure 25-3 Alignment of the initiation proteins of P2 and some P2-like phages. The amino acids were aligned using the 
CLUSTAL X program (59). Amino acids identical in at least four proteins are indicated by dark gray shading. The common 
motifs for proteins initiating rolling-circle replication are underlined (62, 75). The location of P2 ori (in the corresponding 
DNA sequence), and the amino acid in phage 186 believed to interact with the REP protein, are indicated by arrowheads. 
group, based on the arrangement of three conserved motifs 
(62, 75). The A/Rep proteins of these P2-like phages belong 
to the Rep class of initiation proteins. This class has the 
conserved motif 1, of unknown function, closest to the 
N-terminus. Motif 2 is located in the center and contains 
two invariant His residues that are believed to be involved in 
coordination of Mg2+ or Mn2+. Motif 3, which contains the 
catalytic site with the two conserved tyrosine residues, is 
located closest to the C-terminus. A comparison of the initiation 
proteins of the P2-like phages is shown in figure 25-3 
(72, 92, 136). Apart from the regions around the three 
conserved motifs, the initiation proteins share few sequence 
identities, but the Rep proteins of phages HP1 and HP2 are 
almost identical. It has been shown that up to 101 bp can be 
removed from the C-terminus of the phage P2 A protein 
without any significant effects on its biological activity 
(112). In fact, as much as 150 bp can be removed, and the 
protein will still be able to cleave and join single-stranded 
substrates, although with a reduced efficiency (T. Krokeide, 
B. H. Lindqvist and E. Haggard-Ljungquist, unpublished 
data). 

The N-terminal parts of the initiation proteins differ 
extensively, except for those of HP1 and HP2, and are 
believed to be involved in interactions with the host proteins 
required for DNA replication. Both phage P2 and phage 186 
require the host proteins DnaB, DnaE, DnaG and Rep (18, 60). 
Genetic evidence indicates a direct interaction between 
the Rep helicase and a glutamic acid residue at position 
155 of the phage 186 A protein (152). Phage 186 also requires 
DnaC, the DnaB helicase loader, in contrast to phage P2 
where the B protein has been shown to interact with DnaB 
and is thus believed to act as a helicase loader (113). However, 
since P2 is able to form a functional minichromosome in 
the absence of protein B, the B protein requirement is only 
valid for phage replication. Phage 186 is unable to replicate 
at restrictive temperature in dnaA-ts mutants of E. coli, but 
this has been shown to be an indirect requirement since 
phage 186 will replicate in integratively suppressed dnaA-defective 
E. coli strains (136, 152). 

The phage P2 A protein has another intriguing property, 
namely that it works preferentially in cis. This was initially 
detected as a lack of complementation of A amber mutants 
by simultaneous wild-type infections in phages S13, ΦX174, 
G4, and P2 (41, 83, 135, 143, 144). That this is a function of 
the A gene alone has been shown using phage P2 'A' 
minichromosomes, which contain only the A gene and 
an antibiotic resistance marker (113a). Cells containing a 
wild-type minichromosome cannot be transformed with 
a minichromosome containing an A amber mutation under 
su− conditions, while this occurs with a high frequency 
under su+ conditions. Furthermore, under in vitro conditions, 
using coupled transcription-translation, the cis preference 
of the phage P2 A protein for its own gene is 
maintained (113a). The mechanism behind the preferred 
cis activity for phage ΦX174 has been suggested to be 
immediate membrane entrapment upon synthesis (41, 148). 
But this cannot be the case for phage P2, since the same 
preference occurs under in vitro conditions. Purified phage 
P2 A protein is unable to cleave covalently closed circular 
double-stranded DNA or linear double-stranded DNA. 
Thus, the coupled transcription-translation per se or some 
component present in the S30 extract is required for the 
cis activity. Possibly the A protein acts on its own DNA 
target before it is fully translated. 

Activation of Late Transcription 

P2 late genes are transcribed from four late promoters, 
designated PP, PO, PV, and PF, while those of phage 186 are 
transcribed from the three promoters P12, PV, and PJ (corresponding 
to the phage P2 promoters PP, PO, and PF, respectively). 
As can be seen in figure 25-4, the promoters have 
poor similarity to the consensus −10 and −35 regions of 
the E. coli σ70-dependent promoters. Instead, they have a 
region with a partial dyad symmetry centered approximately 
55 bp upstream of the transcriptional initiation site. This 
dyad symmetry has been shown by deletion analysis and 
base substitutions to be essential for promoter activity 
(46, 147). 

Initiation of late gene transcription requires active DNA 
replication and a transcriptional activator, designated 
Ogr in P2 and B in 186 (13, 24-26). Ogr and B have been 
shown to be members of a family of small transcriptional 
activators containing a C2C2 zinc finger motif that are 
present in some members of the P2 family of phages: NucC 
from a cryptic prophage in Serratia marcescens (64), Pag from 
the Salmonella phage PSP3 (66), Ogr from the Pseudomonas 
phage ΦCTX (103), and ORF13 from Vibrio cholerae phage 
K139 (72) (figure 25-5). However, no similar protein has been 
found in either phage HP1 or phage HP2 of Haemophilus, 
indicating that these phages activate late gene transcription 
by another mechanism (35). The late genes of P2, and most 
likely phage 186, can also be directly activated by the δ proteins 
of satellite phages P4 and ΦR73 (66). The δ proteins 
are members of the same family of zinc-containing proteins 
as Ogr and B. The δ protein of phage P4, however, is twice the 
size of the P2 Ogr protein and contains two zinc fingers. It 
appears to be a covalent dimer of Ogr (51, 68) (figure 25-5). 
The binding of the transcriptional activators phage PSP3 
Pag, phage P4 δ, and phage ΦR73 δ to the phage P2 late 
promoters, and the B protein binding to the phage 186 PV 
promoter, have been analyzed by DNase I footprinting, and 
the results confirm the binding of the activators to the dyad 
symmetry around position −55 (66, 67, 117). 

The phage P2 Ogr, phage 186 B, and phage ΦR73 δ proteins 
have been purified. The assumption that the conserved 
cysteine residues are involved in forming a complex with 
Zn(II) is supported by 65Zn blotting and atomic absorption 
spectroscopy analysis (69, 79, 117). Moreover, site-directed 
mutagenesis of the conserved cysteines of phage P2 Ogr 
Figure 25-4 Comparison of the promoter regions of the late operons in phages P2, 186, and P4. The transcriptional start 
sites are indicated by the arrows. The common sequences are indicated in bold, and their consensus sequences are indicated 
below the promoter sequences and compared with conserved −10 and −35 regions of the σ70 promoter. 

Figure 25-5 Alignment of the transcriptional activators for the late promoters of P2 and some P2-like phages. Alignment was 
performed using the CLUSTAL X program (59). The conserved cysteines are indicated by dark-gray shading, amino acids 
common to at least five proteins in light gray. Amino acids believed to be involved in interactions with the RNA polymerase α subunit 
are indicated with arrowheads. 


and phage ΦR73 δ proteins also shows that the cysteines 
are required for biological activity (43, 44, 69). In fact, even 
replacing the C2C2 motif with a C2H2 zinc-binding motif 
in ΦR73 δ leads to a complete loss of zinc binding and 
biological activity (69). 

The Ogr-like transcriptional activators are believed to 
interact with the α subunit of the host RNA polymerase, since P2 Ogr is 
inactive in E. coli strains with specific mutations in the rpoA 
gene (3, 42, 140). By isolating mutants of phages P2, P4, and 
ΦR73 able to grow on such E. coli rpoA mutants, the amino 
acids equivalent to P2 Ogr residues 13, 19, 20, 42, 
and 44 have been implicated in the interaction with the α subunit 
of RNA polymerase (52, 69, 74). These residues are not 
conserved in all Ogr-like proteins (figure 25-5), but they are 
located in regions that are conserved among the members. 
The N-terminal two thirds of the protein is well conserved 
among the Ogr-like proteins, but not the C-terminus. 
This fits with the finding that the 21 C-terminal amino 
acid residues are not required for phage P2 Ogr protein 
activity (44). 


The P2 ogr and 186 B genes are transcribed from two 
different promoters. A normal E. coli σ70 promoter, which 
should be active early after infection, precedes both genes. 
Late in infection they are transcribed instead as part of one 
of the late tail operons (14, 71). In phage 186, the expression 
of B is controlled by the CI protein, while the P2 Ogr 
protein seems to be under indirect immunity control (14, 26). 
Furthermore, in phage 186 it has been shown that activation 
of the late promoter requires only the B protein and that 
DNA replication is not necessary (25). Thus, phage P2's 
requirement for DNA replication for late-promoter activation 
might be explained by an Ogr-protein dose effect. 

Lysis 

All double-stranded DNA phages studied so far use a holin-endolysin 
system for host cell lysis (154, chapter 10). The 
endolysins can be grouped according to their muralytic 
activities. Transglycosidases and lysozymes attack the glycosidic 
bond, and the amidases and endopeptidases attack 
the amide and peptide bonds (155). In most cases, the holin 
and the endolysin genes are localized in a lysis cassette 
together with genes encoding accessory lysis proteins. Bacteriophage 
P2 and the P2-like phages studied so far have two 
essential lysis genes. The first is the endolysin gene, designated K, 
P, lys, or ORF28 in phages P2, 186, HP1 and HP2, and K139, 
respectively, which is orthologous to the λ R gene encoding 
a transglycosylase. The second is the holin gene, designated 
Y, ORF24, or hol in phages P2, 186, and HP1 and HP2, respectively (154, 
160) (table 25-1). The holin of phage HP1, Hol, differs from 
Y and ORF24 of phages P2 and 186. A holin for phage K139 
has not yet been identified. This might not be surprising 
considering the extensive diversity of holin genes. So far, 
more than 35 unrelated orthologous gene families have 
been identified (150; chapter 10). 
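
Because the same function carries a different gene name in each phage, a small lookup table can make the nomenclature easier to follow. The sketch below simply restates the gene names given in the paragraph above; the dictionary layout and the helper name phages_without_holin are illustrative assumptions, not an established data format.

```python
# Essential lysis genes as named in the text; None means "not yet identified".
LYSIS_GENES = {
    "P2":      {"endolysin": "K",     "holin": "Y"},
    "186":     {"endolysin": "P",     "holin": "ORF24"},
    "HP1/HP2": {"endolysin": "lys",   "holin": "hol"},
    "K139":    {"endolysin": "ORF28", "holin": None},
}


def phages_without_holin(table):
    """Return the phages for which no holin gene is listed."""
    return [phage for phage, genes in table.items() if genes["holin"] is None]


print(phages_without_holin(LYSIS_GENES))
```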

Phage P2 has two ancillary lysis genes, lysA and lysB, 
which are functionally homologous to the Rz and Rz1 proteins, 
although lysA and lysB are two separate but adjacent 
genes while Rz1 is embedded within Rz in a different reading 
frame (154, 160). Mutations in lysA cause slightly delayed 
lysis, and the gene might encode an antiholin, while mutants in lysB 
show accelerated lysis (160). Genes homologous to lysB are 
also found in phage 186 and phage ΦCTX, where they seem 
to play a role in the correct timing of lysis. 

Lysogenization 

All P2-like phages studied so far integrate into the host chromosome 
upon lysogenization. The integration is mediated by 
a phage-encoded integrase that promotes recombination 
between a phage attachment site (attP) and a bacterial 
attachment site (attB), generating host-phage junctions, 
designated attL and attR. This site-specific recombination 
leads to no loss or gain of nucleotides. Integration also 
requires the integration host factor, IHF, which acts as an 
architectural protein by bending the DNA. The reverse 
event, excision, requires an additional phage-encoded 
protein. Thus, the P2-like phages use the same mechanism 
for integration as the well-studied λ site-specific recombination 
system, but the phage proteins and their DNA binding 
sites differ (78). 

Chromosomal Insertion Sites 

The preferred integration sites of phages P2, 186, and 
WΦ in the E. coli genome (attB) differ in sequence and in 
location (table 25-2), which is noteworthy since these integration 
sites are recognized by phage-encoded integrases 
that are presumed to have a common ancestry. This suggests 
that there has been selection in favor of different integration 
sites, allowing several P2-like prophages to occupy the same 
host. A strongly preferred attachment site for phage P2 has 
been found only in E. coli C strains. The corresponding sites 
on the maps of E. coli K-12 and B strains are occupied by 
defective P2-like prophages. As a consequence, phage P2 
integrates at many secondary sites, which show up to 37% 
mismatches within the core sequence (6, 138). 
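
To give a sense of scale for the mismatch figure, the sketch below computes the percentage of mismatched positions between two equal-length sites. It uses the 27-nt P2 attB sequence listed for E. coli K-12 in table 25-2 (treating that sequence as the core is an assumption) and a purely hypothetical secondary site, constructed here by complementing the first ten bases; no real secondary-site sequence is implied.

```python
def percent_mismatch(site_a, site_b):
    """Percentage of positions at which two equal-length sites differ."""
    if len(site_a) != len(site_b):
        raise ValueError("sites must have the same length")
    mismatches = sum(a != b for a, b in zip(site_a, site_b))
    return 100.0 * mismatches / len(site_a)


# P2 attB for E. coli K-12 as listed in table 25-2 (spaces removed).
p2_attb = "AAAAAATAAGCCCCTCTAAGGCACATT"

# Hypothetical secondary site: first ten bases replaced by their complements,
# so exactly ten positions differ. This is an invented example sequence.
complement = str.maketrans("ACGT", "TGCA")
hypothetical_site = p2_attb[:10].translate(complement) + p2_attb[10:]

print(round(percent_mismatch(p2_attb, hypothetical_site), 1))
```

On a 27-nt site, ten mismatched positions correspond to about 37%, so a site at the reported upper limit would differ from the preferred sequence at roughly one position in three.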

The integrases of the P2-like phages analyzed so far are 
all members of the integrase family of recombinases, which 
recognize inverted repeated sequences in the host chromosome. 
The cleavage-joining reaction is believed to occur in 
two steps. First, one strand in attP and one in attB are 
cleaved, exchanged, and joined to each other. The second 
cleavage-joining takes place after a short branch migration 
has occurred. The sequence recognized by the integrase, 
the core sequence, is therefore denoted BOB' in the host, 
where B and B' are the sequences recognized by the integrase 
and O is the intervening region where branch migration 
takes place. Among the P2-like phages, the crossover 
points have been determined only for phage HP1, which 
gives a branch migration of 7 bp (50) (table 25-2). Interestingly, 
the core sequence of phage ΦCTX contains only a 
direct repeat, and in the case of phage P2 the inverted 
repeat constituting the B and B' sequences has poor identity. 
By mutational analysis the right half, B', has been 
shown to be the primary recognition sequence (C. Frumerie, 
A. Yu and E. Haggard-Ljungquist, unpublished data). 
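
The BOB' layout can be illustrated with the HP1 attB sequence listed in table 25-2. In the sketch below, the split of that sequence into B, O, and B' at the two cleavage sites is an assumption based on the table's cleavage marks, and the helper name reverse_complement is illustrative. The code checks that the two arms form an inverted repeat and that the overlap region is 7 bp long.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(COMPLEMENT)[::-1]


# HP1 attB from table 25-2, split at the two cleavage sites (assumed layout).
b_arm, overlap, b_prime_arm = "AGGGA|TTTTAAA|TCCCTT".split("|")

# In an inverted repeat, B' begins with the reverse complement of B.
is_inverted_repeat = b_prime_arm.startswith(reverse_complement(b_arm))

print(is_inverted_repeat, len(overlap))
```

The same check applied to a core with only a direct repeat, as described for ΦCTX, or to the poorly matched arms of P2, would not be expected to succeed.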

Many phages integrate into tRNA genes. To avoid disruption 
of the gene, the phage contains a long identity region so 
that part of the tRNA is duplicated upon integration and a 
complete copy of the gene is maintained. It has been 
suggested that the primordial attB sites were tRNA genes 
(23). As can be seen in table 25-2, phages P2 and WΦ do not 
integrate into tRNA genes. Instead, they have short identity 
regions located in spacer regions between E. coli genes, and 
phage K139 integrates between the flaC gene and the gene 
encoding the flagellin core protein, A, in V. cholerae. 

Structure of the Phage Attachment Sites 

The site-specific recombination systems of phages P2, 186, 
WΦ, and HP1 are very similar to the λ system, where the 
integrase and the integration host factor (IHF) assemble at 
the phage attachment site, attP, forming the recombinogenic 
intasome that unites into a synaptic complex with the host 
attB site (78). Like λ integrase, the integrases of phages P2, 
186, WΦ, and HP1 also have two different DNA binding 
epitopes, one recognizing the core DNA sequence and the 
other the arm sequences (36, 87, 158). The reverse event, excision, 
requires an additional phage-encoded protein that in λ 
is designated Xis. In phages P2, 186, WΦ, and HP1, the 
equivalent function is provided by the repressor of the 
lysogenic promoter: the Cox protein in phage P2, in phage 
WΦ, and in phage HP1, and the Apl protein in phage 186 
(30, 36, 37, 87, 159). However, as can be seen in figure 
25-6, there are differences in the size of the attP regions and 
in the number and location of the binding sites of the 
proteins required for recombination in the respective phage. 
Phage HP1 has the largest attP site, spanning about 400 bp, 
Table 25-2 Host Integration Sites of P2-Related Phages 

Phage | Host | Integration site (kb from ori) | Gene(s) or ORF(s) | attB sequence (a) | Reference/Accession No. 
P2 | E. coli K-12 | 2 165.2 | Between yegQ and b2083 | AAAAAAT AAGCCCCTCT AAGGCACATT | 156 
P2 | E. coli O157:H7 | 2 842.6 | Between Ec2889 and Ec2890 | AAAAAAT AAGCCCCTCT AAGGCACATT | 54 
P2 | E. coli O157:H7 EDL933 | 2 912.8 | Between Z3250 (yegQ) and Z3251 | AAAAAAT AAGCCCCTCT AAGGGAGATT | 113 
WΦ | E. coli K-12 | 4 103.9 | Between cpxR and pfkA | GACACCAT CCCT CT CTT CCCCCACAT CCT CT CCCCGTTTTTTTT AT C | 86 
186 | E. coli K-12 | 2 783.8 and 3 213.3 | Within ileY and ileX tRNA-Ile | TGCT GGACTT G AACCAGCG ACCAAGCGATT AT GAGT | U32222, NC_001317 
HP1 | Haemophilus influenzae | 91.8 and 139.6 | Within tRNA-Leu | AGGGA↓TTTTAAA↑TCCCTT | 52, 53 
ΦCTX | Pseudomonas aeruginosa | 2 947.6 | Within tRNA-Ser | ATATGGCGGAGGCGG TGAGATTCGAACTC | 55 
K139 | Vibrio cholerae (chromosome 1) | 2 334.4 | Between FlaC and Flagellin core protein A | CAG AAAAG G G G CTTTT CTTTTTT C (b) | 104 

a The cleavage sites in HP1 are indicated by arrows. Inverted repeats are underlined with a single line, direct repeats are underlined with a double line. 
b Not verified experimentally. 
Figure 25-6 A schematic drawing of the attP regions of phage P2 and some P2-like phages. Shown are the relative binding 
sites of proteins required for integration and excision. In phage HP1, dark gray indicates integrase binding sites required 
for recombination, those in light gray have a stimulatory function, and white binding sites are nonessential. See text for more 
details. 


whereas the equivalent site in phage P2 is about half that 
size. Compared with phage P2, phage HP1 has additional 
integrase binding sites (38, 49, 158). The C-terminal DNA 
binding epitope, which recognizes the core sequence, has 
two additional binding sites on either side of the core. The 
left site is dispensable and the right site is required for excision 
but not for integration. The phage HP1 integrase also 
has an extra arm-binding site, about 250 bp to the left of the 
core, which is stimulatory. It should be noted that this site 
has an inverted repeat, in contrast to the other two arm-binding 
sites, which have direct repeats. Both phages HP1 
and 186 have two IHF binding sites, one on each side of the 
core, while phage P2 has only one site, which is to the left of 
the core (28, 61, 158). 

All three phages have the Cox/Apl binding site located to 
the right of the core sequence, but they differ in both 
binding-site number and orientations. Phage HP1 has 
only two Cox-binding sites, located as a directly repeated 
sequence of 10 bp, spaced by 1 bp, and one Cox tetramer 
binds to each repeat (37). The phage P2 Cox protein has six 
repeated sequences of 9 bp, oriented so that they can form a 
long inverted repeat with three repeats in each arm (159). 
In phage 186, the Apl protein recognizes five repeated 
sequences of 6 bp, all in the same orientation and located 
so that they will be exposed on the same face of the DNA 
helix (30). Phage WΦ has the integrase and IHF binding 
sites located at similar positions compared with phage P2, 
but the core sequences differ while the arm sequences 
are identical (87) (table 25-2). The location of the phage WΦ 
Cox protein binding sites has not been determined. 


Phage Proteins Involved in Site-Specific 
Recombination 

The alignment of 105 site-specific recombinases of the
integrase protein family has revealed two conserved boxes
and three conserved patches of charged amino acids (110)
(figure 25-7). Box I contains the invariant Arg residue, and
Box II contains the conserved His-Xxx-Xxx-Arg motif and
the active-site tyrosine, which in phage ΦCTX is shifted 3
amino acids compared with the others. The phage λ
integrase, the prototype for genetic and biochemical studies
of the integrase family of recombinases, has at least two
domains: a small N-terminal domain that has a high affinity
for the arm-type sites in attP, and a large C-terminal domain
that contains both the low-affinity core-binding site and the
catalytic site. The DNA-bending protein, IHF, functions by
binding to the region between the arm and the core,
producing a U-turn that brings the integrase binding sites into
close proximity. This allows the integrase bound to the
high-affinity arm sites to bind to the low-affinity core site
(73, 122). The structures of the catalytic domains of the
phage λ and phage HP1 integrases have been determined,
and they contain a globular domain composed of a bundle
of α-helices with a three- or four-stranded antiparallel
β-sheet on the outside (21, 76). In the monomer structure of
phage HP1 integrase, the 17-residue-long C-terminal tail
extends away from the globular domain, and the dimers
observed in the crystal are formed by interactions of the
C-terminal tails that orient the active-site clefts antiparallel
to each other (21). Both phage HP1 and phage λ integrases
have been shown to form dimers and oligomers in solution
(49). However, in the case of λ integrase, it seems to be
the N-terminus that is involved in protein-protein
interactions (63). Using genetic and biochemical analyses, phage
P2 integrase has also been shown to form dimers, and in
this case the C-terminus as well as central parts seem
to be involved in protein-protein interactions (C. Frumerie,
J. M. Eriksson, M. Dugast, and E. Haggard-Ljungquist,
unpublished data). When the integrases of the P2-like phages
are aligned they show some homology outside the conserved
boxes and patches, even though very few amino acids are
conserved in all proteins. It should be noted that the
integrases of HP1 and HP2 are almost identical (figure 25-7).

Figure 25-7 Alignment of integrase proteins of phage P2 and some P2-like phages. The amino acids were aligned using the
CLUSTAL X program (59). Amino acids present in at least four proteins are indicated by gray shading. The conserved boxes
and patches among the integrase family of proteins are underlined (110).

Figure 25-8 Alignments of amino acid sequences. Alignments were performed using the CLUSTAL X program (59). A: Cox
proteins. The location of the helix-turn-helix structure believed to be involved in DNA binding is indicated. The location of a
P2 cox mutant, defective in dimerization, is indicated by a star. Amino acids present in at least four proteins are indicated by
gray shading. B: Immunity C proteins of P2 and P2-like phages. The two α-helices presumed to contain the DNA binding
motif are indicated. The mutations discussed in the text are indicated above the sequence. Amino acids common to all three
proteins are shaded in dark gray, those present in two are shaded in light gray. C: Immunity cI proteins of 186 and the
186-like proteins. Amino acids common to at least four proteins are shaded in dark gray, those present in three are shaded in
light gray. The locations of the two α-helices involved in DNA binding are indicated.

As noted above, in the P2-like phages the repressors of the
lysogenic promoters also have an architectural role during
excisive recombination. However, in the case of P2 and
HP1, the Cox proteins are also involved in regulating the
direction of recombination, since they are not only required
for excisive recombination but also inhibit integrative
recombination (37, 159). Binding of the Cox protein to attP
in phage HP1 has been shown to prevent binding of integrase
to the right arm site, which explains its inhibitory effect
on integrative recombination. A similar analysis has not
been performed with the phage P2 Cox protein, but P2 Cox
binding has been shown to bend the DNA target about 72°
(J. M. Eriksson and E. Haggard-Ljungquist, unpublished
data). This bending might affect binding of Int protein to the
P' arm, since the footprints of the Cox and Int proteins
overlap slightly (158). The DNA-binding domains of
the Cox/Apl proteins are believed to be helix-turn-helix
motifs located at the N-terminal end of the respective protein
(29, 37, 129). This region is also the most conserved part of the
proteins (figure 25-8). The native forms of the Cox proteins
of phages HP1 and P2 are tetramers that can self-associate
to octamers, while protein Apl of phage 186 is a monomer
in solution (34, 37, 134). It has been shown with phage P2
Cox protein that the protein-protein interaction interface is
located in the C-terminal region and that oligomerization is
necessary for biological activity (34).

The fact that the integrase and the cox genes are located
in two different, mutually exclusive transcriptional units
(the lysogenic and the early transcript, respectively) poses
a problem, since both are needed for excision of the
prophage after derepression. Thus, spontaneous induction of
a lysogen is low, but measurable (7). However, derepression
of a P2 lysogen with a temperature-sensitive repressor is
abortive: less than 1% of the bacteria will produce phage
(10). This has been shown to be due to insufficient
production of integrase from the prophage (93). The level of
expression of P2 integrase is affected at several steps. First, the Pc
promoter is rather weak and under negative control by the
Cox protein. Second, there is a partial transcriptional
terminator, located between gene C and int, that allows
only 30% read-through. Third, the final transcriptional
terminator is located downstream of attP. This means
that the C-int transcript lacks a terminator signal after
integration and will continue into an untranslated region in
the host chromosome, which might make the transcript
accessible to RNase attack. Finally, the integrase functions
as a translational repressor by binding to its own transcript,
covering the ribosomal binding site (156).

A Model for Intasome Formation 

The HP1 site-specific recombination system has been
extensively analyzed. By mutating the different DNA binding sites
of the proteins involved in recombination, a tentative model
for intasome formation has been generated (38). The basic
steps for integrative recombination are: (i) the C-terminal
ends of two integrase molecules bind to the core
sequence (site 4 in figure 25-6); (ii) the N-terminus of each
core-binding integrase molecule binds to one of the repeats
in the respective arm (site 2 and site 5) in a process that is
stimulated by binding of IHF; (iii) two other integrase
molecules bind to the free arm repeats with their N-terminal
ends at sites 2 and 5; and (iv) their C-termini bind to
the attB core sequence, allowing recombination.

Control of Lytic Versus Lysogenic Growth 

The developmental switch of temperate phages must be set
so that the phage after infection enters either the lytic cycle
or the lysogenic cycle. The two pathways must be mutually
exclusive. There are three common characteristics of the
transcriptional switches of the P2-like phages: (i) two
face-to-face promoters that control the lytic or lysogenic
functions with partially overlapping transcripts; (ii) two
repressors, the immunity repressor and the Cox/Apl
repressor, which recognize different operators and control each
other's promoter; and (iii) the Cox/Apl repressors also act as
excisionases. However, the P2-like phages have two types of
immunity repressors, based on size, composition, and
structure of the target. The establishment and maintenance of
immunity also differ among members of the group, since
some have a cII-like function and some respond to the host
SOS function. Schematic drawings of the switches of
phages P2 and 186 are shown in figure 25-9.

The Transcriptional Switch 

Although the transcriptional switches of the P2-like phages
have similar arrangements, several differences are evident.
Enlarged representations of the promoter-operator regions
are shown in figure 25-10. Phage HP1 is unique since it has
two early promoters, PR1 and PR2, but as PR1 is relatively
weak it has been suggested to play only a minor role in vivo
(39). The two face-to-face promoters that control the
switches give transcripts that overlap for different lengths
in the different phages. The phage P2 transcripts have
the shortest overlap, about 35 bp (A. Ahlgren-Berg and
E. Haggard-Ljungquist, unpublished data). Depending on
which promoter is used in phage HP1, the transcripts
overlap by 44 or 72 bp (39). In phage 186 the overlap is about
60 bp, and other phages in the group have overlaps of
72-75 bp (29, 87, 105, 120).

Figure 25-9 A schematic drawing of the transcriptional
switches. A: Phage P2. B: Phage 186. See text for more
details.

The early promoters are much stronger than the
lysogenic promoters for all phages except HP1, where the two
promoters are of similar strength (29, 39, 87, 120, 130). In
all phages the strength of the lysogenic promoter is
significantly reduced (10- to 20-fold) by the presence of the lytic
promoter. In phages P2 and 186 the interference of the
early transcript with the activity of the lysogenic promoter
can be abolished by addition of immunity repressor, but
this has not been found in HP1 (29, 39, 87, 120, 130).

As can be seen in figure 25-8, the immunity proteins of
the P2-like phages can be divided into two types based on
size and sequence similarity: the P2-like and the 186-like
proteins. The three C proteins of the P2 type are small,
homologous proteins. The native form of phage P2 C protein
is a dimer, and the C-terminal part is believed to be
involved in dimerization (89, 121). The C-terminal ends are very
similar, and the C proteins of phages P2 and WΦ can form
heterodimers that are able to repress a hybrid operator
containing both phage WΦ O1 and phage P2 O2 operators,
which neither protein can accomplish alone (P. Peltola
and E. Haggard-Ljungquist, unpublished data). Thus, two
operators and dimerization of the C protein are required for
biological activity. The N-terminal ends of the C proteins
contain two predicted α-helices believed to be involved in
DNA binding. The phage P2 c5 and the phage P2 Hy dis
vir14 mutations, which are located in the first α-helix,
change the conserved glutamic acid residue at position 15
to a valine and a lysine, respectively (120). Another mutation
in phage P2, c8, is located in the second α-helix, which is less
conserved, as is expected if it constitutes the DNA
recognition helix. The c8 mutation has been shown not to affect
dimerization, which fits with the hypothesis that the
N-terminus constitutes the DNA binding motif and that
the dimerization interface is located in the C-terminus (121).

Figure 25-10 The nucleotide sequences of the promoter-operator regions of phage P2 and some P2-like phages. The -10 and -35 regions are underlined, and
the starts of the transcripts are indicated with bent arrows. The operators for the immunity repressors are indicated by light gray shading, and those of the Cox/Apl
repressors by dark gray shading. The arrows above the shaded operators indicate the orientation of the respective recognition sequence for each protein. The start
codons of the genes are indicated.

The immunity repressors of the P2 type recognize direct
repeats of different lengths, spanning either the -10 or the
-35 region of the respective promoter. The distances
between the operators also differ. In phages P2 and WΦ they are
located on the same face of the DNA helix, since the distance
from center to center of the repeats is 22 and 33 bp,
respectively (87, 94). In phage P2 Hy dis, however, the distance is
26 bp, and they will therefore not be on the same face of the
DNA (120).
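
The same-face argument can be checked with simple arithmetic. The sketch below assumes B-form DNA with about 10.5 bp per helical turn and uses an arbitrary 60-degree cutoff for "same face"; neither value comes from the text, and the function names are illustrative only.

```python
# Assumption: B-form DNA with ~10.5 bp per helical turn (not stated in the text).
BP_PER_TURN = 10.5


def angular_offset(spacing_bp, bp_per_turn=BP_PER_TURN):
    """Angle (0 to 180 degrees) between two sites around the helix axis,
    given their center-to-center spacing in base pairs."""
    phase = (spacing_bp % bp_per_turn) / bp_per_turn * 360.0
    return min(phase, 360.0 - phase)


def same_face(spacing_bp, tolerance_deg=60.0):
    """Hypothetical criterion: sites count as 'same face' if their
    angular offset is at most tolerance_deg."""
    return angular_offset(spacing_bp) <= tolerance_deg


# Center-to-center operator spacings quoted in the text.
spacings = {"P2": 22, "W-phi": 33, "P2 Hy dis": 26}
for phage, bp in spacings.items():
    print(f"{phage}: {bp} bp, offset {angular_offset(bp):.0f} degrees, "
          f"same face: {same_face(bp)}")
```

Under these assumptions the 22 bp and 33 bp spacings place the operators about 34 and 51 degrees apart around the helix axis, whereas the 26 bp spacing places them about 171 degrees apart, on nearly opposite faces.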

The cI immunity repressors of the phage 186 type are
about twice the size of the C repressors of the phage P2
type, and the two subgroups show no sequence similarity
(37, 70, 105) (figure 25-8). A recent search for proteins
homologous to phage 186 cI repressor has identified several other
phages/prophages, in different hosts, containing related
proteins (133), but they have not yet been analyzed for other
properties in common with the P2-like phages and are thus
not discussed here. The cI repressors of phages 186 and HP1
have been shown to recognize inverted repeat sequences.
The cI protein of phage 186 has been purified and the native
form was shown to consist of dimers that self-associate via
tetramers to octamers (134). The protein has further been
shown to consist of two domains, where the N-terminal
domain contains the DNA-binding motif and the C-terminal
domain contains the oligomerization surface (27, 133).
Besides controlling the early lytic promoter PR, the phage
186 cI repressor controls the expression of the late promoter
activator protein B and the expression of cII. In addition,
a third binding site within the cI gene has been
identified (27). Phage 186 cI repressor has three binding
sites within PR: one to the left of the -35 region, one
between the -35 region and the -10 region, and a third to
the right of the -10 region. Each operator is repeated at a
distance of two turns of the helix (27, 77) (figure 25-10).
It should be noted that the central operator of phage 186
(OI) has a consensus sequence that differs from that of
the adjacent operators (OII and OIII), and by mutational
analysis it has been shown that α-helices 2 and 3 at the
N-terminus, which constitute a weak helix-turn-helix
motif, recognize both operator sequences (27, 133). Phage
HP1 has two predicted operators spanning the -10 region
of promoter PR2 (39). The CI proteins of phages 186, HP1,
HP2, and K139 show similarities in size and amino acid
sequence, but the number of conserved amino acids is small,
and most are located in the C-terminal part of the proteins
(35, 70, 105) (figure 25-8). The cI proteins of HP1 and HP2,
however, are almost identical.

The repressors that control the lysogenic promoter of the
respective switch are the multifunctional Cox/Apl proteins,
which also act as architectural proteins during excisive
recombination. The Cox/Apl proteins recognize a different
DNA sequence compared with the immunity proteins. For
phage P2, the Cox recognition sequences overlap with the
Pc promoter. For all other P2-like phages, they are located
downstream of the -10 region (figure 25-10). Thus, with
phage P2 the Cox protein may block RNA polymerase
binding, but in the other P2-like phages the Cox/Apl proteins
may affect steps later than RNA polymerase binding, such
as open-complex formation or promoter clearance. The
Cox proteins, with the exception of phage HP1 Cox, also
autoregulate their own expression (37, 87, 129), but this does
not appear to be of biological significance since Cox mutants
show normal lytic growth. The P2 Cox protein has been
shown to bend its DNA target upon binding, and if this is also
true for the other Cox/Apl proteins then it can explain the
effect on both promoters (J. M. Eriksson and E.
Haggard-Ljungquist, unpublished data).

The Cox proteins of phages P2, P2 Hy dis, and WΦ are
more similar to each other than to the Cox/Apl proteins
of phages 186, HP1, HP2, and K139 (figure 25-8). The
most conserved region is the helix-turn-helix motif in the
N-terminal part of the proteins, believed to be involved in
DNA binding. It is noteworthy that the HP1 and HP2 Cox
proteins are similar only at the N- and C-termini, while the
interiors of the two proteins show no similarity. This is in
contrast to the cI, Int, and A proteins, which are almost
identical in phages HP1 and HP2.

Satellite phage P4, which needs all P2 late functions for
lytic growth (see chapter 26), needs a signal to recognize
the presence of a helper phage. When phage P2 infects a
P4-lysogenic strain, the P2 Cox protein is used as a
transcriptional activator that turns on the phage P4 PLL promoter
that controls P4 DNA replication (128). The P4 PLL promoter
is normally activated by the P4 δ gene product (139). The
phage P2 Cox protein has no similarity to phage P4 δ
protein, but acts by mimicking the way δ bypasses the
normal P4 immunity, that is, by binding to an extended
region upstream of the PLL promoter. This region contains up
to six Cox recognition sequences that are all positioned in
the same direction (128). The Cox protein of phage P2 Hy dis
is also able to activate the phage P4 PLL promoter, in contrast
to the phage WΦ Cox protein, which is unable to do so (87, 120).

Establishment of Lysogeny 

Phage 186 has been found to have an additional gene
(cII) that controls the establishment but not maintenance of
lysogeny. A homologous gene is present at the same location
in phage K139 (72), but the function of the cII protein has
been studied only in phage 186. The phage 186 cII gene is
located in the early operon, downstream of gene apl, and
encodes a transcriptional activator that acts on the PE
promoter located between genes apl and cII (77, 106, 107)
(figure 25-9). The PE promoter, which is negatively controlled
by the cI protein, has in the presence of the cII protein about
the same strength as the PR promoter and it increases
transcription of cI at least 55 times (106, 107). The PE
promoter, as opposed to the PL promoter, is not affected by
the Apl protein (107). The cII protein recognizes an inverted
repeat sequence, separated by two helical turns of the
DNA helix, that is located upstream of the PE promoter.
The cII protein has a predicted helix-turn-helix motif at
the N-terminal end, believed to be the DNA binding site.

The control of establishment of lysogeny of phage 186 is
very similar to that of phage λ, since they both have a cII
protein that functions as an activator of a promoter located
upstream of the weak PL promoter that enhances
transcription of the cI gene. Even though the cII proteins have a
similar function in these phages, they have no detectable
amino acid sequence homology and the transcriptional
switches controlling the maintenance of these phages are
very different (118).

The fact that phages 186 and P2 have similar
lysogenization frequencies, even though P2 lacks a cII protein, is
difficult to explain. With phage P2 the lytic promoter is at least
100 times stronger than the lysogenic promoter, while in
phage 186 the cII-activated PE promoter makes the opposing
promoters almost equal in strength. Thus, other factors
involved in the control of these molecular switches will
most likely be revealed in the future.

Induction 

Bacteriophage P2 has been the prototype for the
noninducible class of temperate phages: indeed, its immunity
repressor protein lacks the sequence present in proteins
that are induced to self-cleavage by the activated RecA
protein. But phage 186 is UV-inducible, even though its
immunity repressor is insensitive to the activated RecA (71).
Instead, the induction of the 186 prophage is dependent on a
phage-encoded protein designated Tum. The tum gene is
located at the left end of the genome and it is transcribed
from the P95 promoter together with the open reading
frame ORF97. The P95 promoter is controlled by the LexA
protein, which is sensitive to the activated RecA protein
(19). The Tum protein has been purified and shown to be an
antirepressor that binds to the cI protein, preventing it from
binding to the operator (132).

Phage P2 and its satellite phage, P4, have the capacity to
derepress each other. As described above,
derepression of a P4 lysogen by phage P2 is mediated by the P2
Cox protein, which acts as a transcriptional activator of the
late P4 promoter, PLL. When phage P4 infects a P2 lysogen,
it needs to derepress prophage P2 in order to gain access
to the P2 late genes. This is accomplished by the P4 ε gene,
which promotes derepression of prophage P2, leading to
in situ phage P2 DNA replication and activation of late
genes (45). The E protein has been shown to act as an
antirepressor since it binds to the C protein, but not to DNA,
leading to a shift from the lysogenic to the lytic mode (88).
Like the C protein, the E protein forms homodimers but not
multimers. However, in the presence of both proteins,
multimeric complexes of E and C are formed that interfere with
the binding of C to its operator (89). The phage P2 sos
mutant grows normally, but in the prophage state it is no
longer derepressed by the P4 E protein. P2 sos will, however,
support P4 growth in a mixed infection (5). The sos mutation
is located within the coding part of the C gene, where it
changes threonine at position 67 to isoleucine, and the E
protein has been shown to be unable to bind to the mutated
C protein (121). The C proteins of phages WΦ and P2 Hy dis
also respond to the E protein, which fits with the fact that
the threonine at position 67 is located in a conserved region
of the C proteins (figure 25-8). The cI proteins of phages 186,
HP1, HP2, and K139 lack this protein domain, and are not
derepressed by P4 E. This explains why prophage 186 is
not derepressed upon a P4 infection (131).

Evolution of P2-Like Phages 

Phage genomes in general can be characterized as mosaics
containing both genes similar to other phage genes, or host
genes, and genes showing no similarity to any known gene
(15, 58). P2-like phages are no exception: they are similar in
many aspects but all contain unique genes, some with
unknown function. A phage is taxonomically classified as
P2-like if it shares some, but not necessarily all, characters
with phage P2 (1). Many phages have been found to comply
with this criterion but there are only six complete genomes
available (table 25-1).

Phylogenetic Relationships 

A genome comparison of these six fully sequenced P2-like
phages reveals that only an integrase gene and nine late
genes (corresponding to genes Q, P, O, N, M, L, S, H, and T
in P2) are both genetically similar and present in all
six genomes. Separate phylogenetic analyses of the amino
acid sequences of the proteins encoded by the nine late
genes all result in the same tree, which implies that they
share the same evolutionary history (figure 25-11). The
closer two phages are positioned in the phylogenetic tree
based on all nine genes, the more similar are the genes they
share in the rest of the genome, for example phage P2
compared with phage 186 (table 25-1). There is no indication
of major recombination events between these nine
homologous genes for any pair of phages, which suggests that it is
likely these genes are inherited clonally (A. S. Nilsson,
unpublished data).

Figure 25-11 Phylogeny of P2 and various P2-like phages.
Unrooted phylogenetic tree showing the relationship
between inferred amino acid sequences from nine late
genes, homologous to phage P2 genes Q, P, O, N, M, L, S, H,
and T. The tree was constructed with maximum-parsimony
criteria and was the shortest tree found using the exhaustive
search procedure. Numbers above branches indicate branch
length (total 7549; the next shortest tree had a total
length of 7732). The total number of characters was 4281
and there were 1913 parsimoniously informative characters.
The tree was constructed with the program PAUP, version
4.0b8a (141).
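
The legend to figure 25-11 distinguishes the total number of characters from the parsimony-informative ones. As a rough illustration of that distinction, the sketch below counts informative columns in an alignment using the usual definition: a column is informative if at least two different states each occur in at least two sequences. The toy alignment is invented, the function names are illustrative, and this is not a reconstruction of the PAUP analysis described in the legend.

```python
from collections import Counter


def is_parsimony_informative(column, ignore="-?"):
    """True if at least two character states each occur in >= 2 sequences.

    Gap and missing-data symbols listed in `ignore` are not counted as states.
    """
    counts = Counter(c for c in column if c not in ignore)
    return sum(1 for n in counts.values() if n >= 2) >= 2


def count_informative(alignment):
    """Count parsimony-informative columns in a list of aligned sequences."""
    if len({len(seq) for seq in alignment}) != 1:
        raise ValueError("aligned sequences must all have the same length")
    return sum(is_parsimony_informative(col) for col in zip(*alignment))


# Invented four-taxon amino acid alignment with five columns.
toy_alignment = [
    "MKTAL",
    "MKSAL",
    "MRSAV",
    "MRTGV",
]
print(len(toy_alignment[0]), "characters,",
      count_informative(toy_alignment), "parsimony-informative")
```

In the toy alignment the invariant first column and the fourth column, where one sequence carries a unique residue, are uninformative: they cannot favor one tree topology over another under parsimony.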

The phylogenetic relationship of other genes, particularly
the early regulatory genes, is often ambiguous and their
evolutionary history cannot be resolved. Phylogenetic
analyses of the amino acid sequences of the six integrase
proteins (figure 25-8) produce very weak trees in which
phages HP1/HP2 and 186 harbor the most similar int genes.
Some informative characters also speak for a close
relationship between the integrases of phages K139 and 186, as well
as between phages K139 and P2. The relationship of the int
gene of phage ΦCTX to the other integrases is impossible to
establish. Phylogenetic analyses of the amino acid sequence
of the A protein show that phages P2 and 186 are closely
related in the central conserved part of the protein and
that phage K139 is related to phages HP1 and HP2, but
these relationships shift more than once in the last third of
the protein (A. S. Nilsson, unpublished data) (figure 25-8).
Although the number of informative characters is rather
low, recombination is a more plausible explanation for these
homoplasies than convergent evolution or recurring neutral
mutations.

Homologous and Non-Homologous
Recombination

Homologous recombination has been shown to be a more
important mechanism than mutation for nucleotide change
in phage P2, which is not surprising since about 30% of
the E. coli strains contain P2-like prophages and exchange
of genetic information is known to occur between host
genomes (40). A study of five late genes from 18 closely
related P2-like phage isolates demonstrated that
homologous recombination is extensive and happens at random
breakpoints (108). The amount of genetic variation was
low in these genes. Pairwise comparisons of the
nucleotide sequence of the most differentiated gene all showed
over 96% identity. Since there was disproportionately
more variation in synonymous rather than nonsynonymous
third-codon positions, it was suggested that these late
genes are subjected to rather strong stabilizing selection.
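
The two measurements mentioned here, pairwise nucleotide identity and the distribution of differences over codon positions, can be sketched as follows. The three short sequences are invented for illustration and are not taken from the cited study; the code assumes the sequences are already aligned, in frame, and start at the first codon position.

```python
from itertools import combinations


def percent_identity(a, b):
    """Percent identical positions between two aligned sequences.

    Columns with a gap ('-') in either sequence are ignored.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(x, y) for x, y in zip(a.upper(), b.upper())
             if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no ungapped positions to compare")
    return 100.0 * sum(x == y for x, y in pairs) / len(pairs)


def differences_by_codon_position(a, b):
    """Mismatch counts at codon positions 1, 2, and 3.

    Assumes gap-free, in-frame sequences starting at codon position 1.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    counts = [0, 0, 0]
    for i, (x, y) in enumerate(zip(a.upper(), b.upper())):
        if x != y:
            counts[i % 3] += 1
    return counts


# Invented 12-nt toy sequences (four codons each).
isolates = {
    "isolate_A": "ATGGCTAAAGGT",
    "isolate_B": "ATGGCCAAAGGT",
    "isolate_C": "ATGGCTAAGGGT",
}
for (name_a, seq_a), (name_b, seq_b) in combinations(isolates.items(), 2):
    print(name_a, name_b,
          round(percent_identity(seq_a, seq_b), 1),
          differences_by_codon_position(seq_a, seq_b))
```

Counting differences by codon position is only a proxy: third-position changes are often, but not always, synonymous, so a proper analysis would translate each codon and classify every substitution.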

The fact that homologous recombination is detectable
between slightly differentiated late genes, but not between
more different late genes, could mean that
recombination does not occur in the latter because different capsids
belong to evolutionarily separate clones with different
life histories, for example different host preference, and
hardly ever meet. Another explanation is that
recombination does occur but that the capsid proteins are too different,
leading to assembly of nonfunctional capsids, which are
quickly eliminated by selection. Indeed, the amino acid
sequences encoded by the capsid genes, for instance those of phages
P2 and ΦCTX, are very different and few positions are informative.
Hence, a third explanation is that minor recombination
events may pass unnoticed, which leads to the wrong
conclusion (A. S. Nilsson, unpublished data).

It is also known that recombination can occur between
two phage P2 mutants in a mixed infection, although at an
extremely low frequency and in a recA-independent manner.
The frequency of ordinary, recA-dependent homologous
recombination, either between two prophages or by
nonreplicating P2 phages infecting a lysogen, is much higher
and seems to be somewhat more frequent around the phage
P2 origin of replication (9).

Phage evolution is not only dependent on homologous
recombination between related phages. Many similar genes,
or parts of genes, are frequently found in otherwise
unrelated phages. A comparison of tail fiber genes from P2, P1,
Mu, λ, K3, and T2 demonstrates regions of similarity, which
indicates a high level of non-homologous recombination
between unrelated phages (47). For example, the protein
assumed to be involved in tail fiber assembly, G, is virtually
identical (93%) in phages P2 and Mu (protein U), but the two
phages share only around 43% of their protein-G amino
acids with ORF45, the corresponding protein in phage 186,
which is supposed to be the closest relative of P2.

Morons 

A phage genome often contains nonessential genes that are
missing in close relatives. Such genes often differ in AT/GC
ratio when compared with the rest of the genome and they
are seldom transcriptionally coupled to other genes but
instead carry their own promoters and intrinsic
transcriptional terminators. These are traits indicating that
these genes are recent additions to the genome capable of
being autonomously transcribed (57). These phage genes
are sometimes referred to as morons (65), and there are at
least three genes, Z/fun, old, and tin, that belong to this
class in the P2 genome. Morons that are transcribed and
translated from an integrated prophage (that is, lysogenic
conversion) may affect the fitness of the host. This seems to
apply to the three P2 genes since they all affect E. coli and
make the lysogen refractory to T5, λ, and T-even phages,
respectively (22, 101, 102).

Many of the P2-like phages in E. coli isolates of the
ECOR collection have different inserts at the locus
corresponding to P2 Z/fun. Amplification of the region between
the G and FI genes, using only one set of primers, gave
12 fragments, of which nine were of different lengths
(1.2-3.6 kb). Sequencing of some of these fragments has
shown that the genetic variation is high. There seem to be
only two pairs of prophages having the same insert, and
many contain easily identified open reading frames with a
high A-T content. Some of these morons have been shown
to carry their own promoters and their expression has been
confirmed in coupled transcription-translation assays, but
the functions of these genes are not known. The morons are
inserted at the same position, in the spacer region between
genes G and FI, creating sharp boundaries between similar
and unique sequences in an alignment. Thus the
mechanism behind the insertions appears to be site-specific (109).
Phages 186 and PSP3 contain nothing but a short
noncoding sequence between the genes equivalent to G and FI,
but phage WΦ has a 1.4 kb insert that is 61% AT. Phage
WΦ is very similar to phage P2 in many respects but it
has a putative methylase gene at the same position as the
old and tin morons in phage P2 (G. E. Christie, personal
communication).
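
Because morons are recognized partly by base composition that differs from the rest of the genome, a simple AT-content scan shows how such a region might stand out. The sketch below is illustrative only: the sequence is an artificial construct, the window and step sizes are arbitrary assumptions, and the function names are hypothetical.

```python
def at_content(seq):
    """Fraction of A and T among the unambiguous bases (A, C, G, T) in seq."""
    bases = [b for b in seq.upper() if b in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no A, C, G, or T")
    return sum(b in "AT" for b in bases) / len(bases)


def at_windows(seq, window, step):
    """AT content in windows of `window` bases, advancing by `step` bases.

    Returns a list of (start_index, at_fraction) tuples.
    """
    return [(start, at_content(seq[start:start + window]))
            for start in range(0, len(seq) - window + 1, step)]


# Artificial example: an AT-rich block flanked by GC-rich sequence.
toy_genome = "GCGCGCGCGC" + "ATATATATAT" + "GCGCGCGCGC"
for start, frac in at_windows(toy_genome, window=10, step=10):
    print(start, f"{frac:.2f}")
```

On real data, window sizes of several hundred base pairs and a comparison against the genome-wide average would be more appropriate than the 10-base windows used for this toy sequence.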

Note Added in Proof 
Dodd, I. B., and J. B. Egan. 2002. Action at a distance in
CI repressor regulation of the bacteriophage 186 genetic
switch. Mol. Microbiol. 45:697-710.

Phage P2 

Christie, G. E., D. L. Anders, V. McAlister, T. S. Goodwin,
B. Julien, and R. Calendar. 2003. Identification of upstream
[...] late promoter. J. Bacteriol. 185:4609-4614.
P4, the Satellite Phage, the Plasmid

In the early 1960s Erich Six isolated a temperate
bacteriophage, P4, that formed plaques only on Escherichia coli
strains lysogenic for P2 or a P2-related phage (90, 91).
Before long, it was established that P4 depended on the P2
helper phage for all morphopoietic and lysis “late” functions,
and that the head and tail proteins of P4 virions were
encoded by P2 (51, 92). Interestingly, the smaller P4
genome, encapsidated into a smaller head (figure 26-1), did
not bear any substantial similarity to P2 DNA and was
independent of the helper for DNA replication and host
lysogenization (68, 91, 92). The functional and structural
unrelatedness of the two replicons clearly indicated that P4
was not a defective P2 but an independent entity and the
prototype of “satellite viruses,” genetic elements that exploit
unrelated viruses for viral propagation.

Later it emerged that, in the absence of the helper
genome, P4 could be maintained not only as an integrated
prophage but also as a multicopy plasmid (28, 44, 68, 75).
Because of its diverse modes of propagation (see figure 26-2),
P4 may be considered as an integrative plasmid that has
evolved the potential for horizontal transfer by a very
specialized phage-mediated transduction mechanism. Like P4,
other bacteriophages, such as the filamentous phages and
the temperate phages P1 and N15, may be stably
maintained as autonomously replicating plasmids in the bacterial
host. These natural phasmids (phage-plasmids) make the
boundary between plasmids and phages less definite and
contribute to our understanding of viruses as part of a large
pool of mobile genetic information that can be exchanged
among living cells.

P4 is no longer a unique example of satellite phages:
in addition to retronphage ΦR73, a P4-like phage isolated
from E. coli (52, 98), a satellite plasmid pSSVx/helper-phage
SSV2 pair, functionally resembling the P4/P2 system, has
been discovered in the archaeon Sulfolobus (4).


The Genome 

A P4 virion contains a linear, double-stranded, cohesive-ended
DNA molecule 11.6 kb long that is encapsidated into a tailed,
icosahedral protein head (figure 26-1). The genome organization of
P4 is diagrammed in figure 26-3A and the known genes and sites are
detailed in table 26-1. Upon infection of the E. coli host, P4 DNA
circularizes through its cohesive ends. The vegetative genetic map
of P4 is circular (8, 27) and reflects the physical structure of
the replicating DNA molecule, whereas the integrated prophage
genome is a circular permutation of the mature DNA (17a).
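
The relationship between the virion DNA (opened at cos) and the prophage (opened at att) can be illustrated with a minimal sketch. The feature string below is a hypothetical toy, not the P4 sequence, and the function names are invented for illustration only.

```python
def is_circular_permutation(linear_a: str, linear_b: str) -> bool:
    """Return True if linear_b is linear_a circularized and reopened elsewhere."""
    if len(linear_a) != len(linear_b):
        return False
    # Every rotation of a string occurs as a substring of the string doubled.
    return linear_b in (linear_a + linear_a)


def open_circle_at(linear: str, index: int) -> str:
    """Circularize a linear molecule and reopen it at position `index`."""
    index %= len(linear)
    return linear[index:] + linear[:index]


# Hypothetical toy molecule: lowercase words stand in for genome features.
virion = "gop-int-att-crr-alpha-sid-psu-"
prophage = open_circle_at(virion, virion.index("att"))
print(prophage)
print(is_circular_permutation(virion, prophage))
```

The doubled-string test only compares feature order on one strand; it says nothing about how integration is catalyzed.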

All the functions required for lysogenic, lytic, and plasmid
development are located in the right 80% of the genome
and include the origin of replication, ori1 (63), the two main
operons, α and sid, which are transcribed divergently from
the ori1 site (26, 30, 50), the prophage integration site, att
(17a, 78), and the int (integrase) gene in a monocistronic
operon located to the left of att.

As in many episomal elements, a “nonessential region”
that carries functions not related to the P4 life cycle (gop,
β, cII; table 26-1) is found adjacent to the att-int integration
module (17a, 41). These genes, organized in the two
constitutively expressed operons to the left of att-int, may be
deleted without affecting lysogenic, lytic, or plasmid
development (41, 58).

Phages P4 and ΦR73 exhibit more than 95% sequence
identity in the essential region between crr and the right
cos site, with the exception of gene δ (31%) and the immunity
region (85%; see below). By contrast, in phage
ΦR73 the nonessential region to the left of att-int is
largely deleted. To the right of the integration module
ΦR73 carries a retron, a genetic element encoding reverse
transcriptase that is responsible for the synthesis of a peculiar
single-stranded DNA-RNA chimeric molecule called
msDNA (52, 98). The divergent genetic organization of the
nonessential regions suggests that phages P4 and ΦR73

Figure 26-1 Structure and morphogenesis of satellite bacteriophage P4 and its helper P2 virions. A: P4 (top) and P2 (bottom) 
virions. Negative stain with phosphotungstate. The tail structures are about 135 nm long. The P4 and P2 heads are, 
respectively, 45 nm and 62 nm in diameter. Electron micrograph by Robley C. Williams, University of California, Berkeley. 
From (43), with permission. B: Structure of the P4 procapsid with the Sid external scaffold (dark ribbon). Reconstruction of 
P4 procapsids at 2.6 nm resolution. The procapsids, approximately 40 nm in diameter, were obtained by coexpressing in vivo 
P2 gpN and P4 Sid. From (35), with permission. C: Comparison of the assembly pathways of P4 and P2 capsids. A indicates a 
cleaved protein. Redrawn from (34), with permission. 


[Figure 26-2 diagram. After infection, P4 follows one of three pathways.
LYTIC CYCLE: P4 replication; P2 derepression; P4 late gene expression; P2 late gene trans-activation.
LYSOGENIZATION: P4 DNA integration; establishment of P4 immunity.
PLASMID ESTABLISHMENT: P4 replication; P4 late gene expression; establishment of multicopy plasmid state.
Outcomes shown: P4 morphogenesis, cell lysis, and P4 phage release; P2-P4 lysogen; immune-integrated P4 carrier; plasmid P4 carrier.]


Figure 26-2 Phage P4 life cycle. See the text for explanation. Redrawn from (69), with permission. 




[Figure 26-3 maps. A: physical and genetic map, scale 0 to 12 kb; genes gop, β, cII, int, α, cnr, ε, sid, δ, psu; sites cos, att, crr, ori2, ori1; promoters and terminators; early and late transcripts. B: immunity region, 8.2 to 9.2 kb; genes kil, cI, eta, vis; sites seqA, seqB, seqC; promoters PLE and PLL; CI RNA and immunity transcripts.]


Figure 26-3 Phage P4 genome. A: Physical and genetic map of P4. B: P4 immunity region. Coordinates are from the
annotated complete nucleotide sequence of P4 (47) (GenBank accession number X51522). Transcription start and
termination sites are indicated by bent arrows and hanging open circles, respectively. The arrows beneath the maps in panels
A and B indicate the early, late, and immune phase transcripts from the two main operons. For other explanations see
table 26-1 and the text. Redrawn from (85), with permission.


evolved via independent illegitimate integration-excision 
cycles. 

The α operon, which is transcribed leftward from two
promoters, PLE and PLL (30), encodes both the genes required
for P4 lytic and plasmid propagation (α and cnr, replication;
ε, helper prophage derepression) and the immunity determinants
required to prevent P4 prophage replication.
Therefore, control of α operon expression is crucial for the
establishment and maintenance of the different developmental
phases and involves the sophisticated mechanisms
described below.

The sid operon codes for regulatory and morphogenetic
proteins involved in plasmid and/or lytic development (26):
the positive regulator, δ, which can both activate the satellite
and trans-activate the helper late operons (26, 30, 95), and
two proteins with a role in P4 head morphogenesis. The
latter are Sid, which forms an external head scaffold and
determines the small size of the P4 capsid (1, 74, 89), and
Psu, a bifunctional protein that suppresses transcription
termination at Rho-dependent terminators (polarity
suppressor: 64, 67, 88) and helps to stabilize the viral particle
(capsid “decoration” protein: 54, 55). The sid operon is
transcribed late after infection from the Psid promoter, which is
activated by P4- and P2-encoded positive regulators (gpδ
and Ogr, respectively: 26, 46).

The Replicon 

Autonomous P4 DNA replication occurs both in the lytic
cycle and during plasmid propagation, and is completely
independent of helper phage functions. Immediately after
infection, a burst of P4 replication occurs and a large
number of P4 DNA molecules (>100) accumulate in the
host cell. By contrast, in the plasmid state DNA replication
is controlled so as to maintain a constant copy
Table 26-1 Bacteriophage P4 Genes and Functions

Gene or Site   Gene Product and/or Function Encoded

cos            Cohesive ends 19 nucleotides long
Pgop           Promoter of the gop-β operon
gop            Causes host cell killing in the absence of β
β              Inhibits gop killing
tcII           Rho-independent transcription termination site
cII            Function unknown. Mutants kill the host cell
PcII           Promoter of the cII gene
int            Integrase
Pint           Promoter of the int gene
att            Site for integrative recombination
crr            Required in cis for replication of both oriI and oriII replicons
tα             Rho-independent transcription termination site
α              Essential for replication: primase, helicase, ori1 and crr recognition and binding
ori2           With crr supports α-dependent, ori1-independent replication (oriII replicon)
cnr            Controls DNA replication and plasmid copy number; interacts with α protein
t151           Putative transcription terminator
orf151         Function unknown
ε              Derepression of the P2 helper prophage
kil            Kills the bacterial host if overexpressed
timm           Rho-dependent transcription terminator. Elicits strong transcription termination from PLE when the CI RNA is present
t4             Rho-independent, CI RNA-dependent transcription termination site
cI             Prophage immunity. Encodes the CI RNA
t1             Rho-independent transcription termination site
PLE            Constitutive promoter
eta            Function of gene product unknown. Its translation prevents transcription termination from PLL
vis            Binds PLL, Psid, and att; negative regulator of PLL; stimulates Psid; excisionase
PLL            Late promoter; positively regulated by P4 gpδ and P2 Ogr and Cox; negatively regulated by Vis
ori1           Origin of DNA replication
Psid           Late promoter; positively regulated by P4 δ and P2 Ogr, stimulated by Vis
sid            Small head determination; procapsid external scaffold
δ              P4 and P2 late promoter activator
psu            Polarity suppression; capsid decoration protein
tsid           Rho-independent transcription termination site



number of about 40 P4 genomes per bacterial chromosome (2).
This control is exerted both by limiting the expression
of the P4 replication protein, gpα, and by directly
controlling the activity of gpα in DNA replication initiation
(see below). P4 DNA replication starts at a unique point, ori1,
and proceeds bidirectionally in the θ form (63).

The P4 DNA replication system is unusual among phage and
plasmid replication systems in that P4 is largely
independent of bacterial functions for replication initiation;
in fact, P4 encodes the multifunctional α protein, which
exhibits primase and helicase activity and specifically binds
DNA sequences at the replication origin (36, 108).
Accordingly, host functions involved in DNA replication initiation,
such as DnaA (initiator), DnaB (helicase), DnaC (DnaB
complex), and DnaG (primase), as well as the Rep helicase,
which is involved in chromosome replication fork
progression, are not required in vivo (6, 10, 33, 62, 68). In vitro,
replication of P4 DNA molecules has been obtained using
a mixture of gpα, DNA polymerase III holoenzyme,
single-stranded DNA-binding protein, DNA gyrase,
and topoisomerase (33).


The P4 α Protein

The P4 α gene, essential for P4 DNA replication, encodes
a 777 amino acid polypeptide. Deletion and point mutants
have been used to dissect the functional domains of gpα
by testing their ability to sustain phage replication in vivo
and/or assaying in vitro the activity of the mutant proteins.
The modular structure of gpα emerged from this analysis,
with primase and helicase activities arranged in distinct,
separable domains (figure 26-4A).

The primase activity is located in the N-terminal half
of the α protein. A potential metal-binding region, with two
CXXC clusters and the EGYATA motif, common to primases
of conjugative plasmids, is essential for P4 replication in vivo
and primase activity in vitro. A potential Mg2+-binding
region was also found (96, 108, 109).

The helicase activity, characterized by the type A
nucleotide binding motif, requires the middle and C-terminal parts
of gpα. Point mutants in the type A motif are defective
in phage propagation in vivo and in helicase activity
in vitro (96, 109, 111).
Figure 26-4 The essential elements for phage P4 replication. A: Functional domains of the P4 α protein. The gray bar
represents the α protein; its functional domains are indicated by the continuous and dotted lines below. Highlighted in white
is a 120 amino acid region exhibiting sequence similarity to primases of IncI and IncP plasmids. The amino acid consensus
sequences of conserved motifs are reported on top, and the amino acid changes affecting the different activities are
listed over the arrows. Pri, primase motif conserved among several prokaryotic primases; Zn2+, metal binding motif found
in other prokaryotic primases and DNA-repair proteins; Mg2+, potential metal binding site, has similarity with other DNA
and RNA polymerases; NBS, type A nucleotide binding site found in other helicases of small DNA and RNA viruses; the Cnr-r
site, defined by Cnr resistance mutations, interacts with the Cnr protein. From (109), with permission. B: Replicons oriI and
oriII. Plasmids pGM545 and pGM526 carry the essential P4 regions of the two P4 replicons, ligated to the chloramphenicol
resistance gene. Replicon oriI: ori1 and crr sites, and cnr and α genes; replicon oriII: ori2 and crr sites, and α gene. The genes
are expressed under plac control. From (103), with permission. C: The minimal ori1 sequence and the two direct repeats
of crr are reported. The type I iterons are boxed. The arrows indicate the orientation and the numbers above the arrows
indicate the position of the first base of each iteron, relative to the first iteron from the left. For ori2, the sequence of the
minimal 22 bp region is reported. Above the sequence is the five-base substitution that inactivates ori2 (D. Ghisotti,
unpublished data).


Other functional domains are located in the C-terminal
end of gpα and are retained by a truncated C-terminal
peptide: (i) a DNA-binding domain, which specifically
binds to ori1 and crr, and (ii) a homodimerization domain
(104, 109). The crystal structure of the α protein C-terminal
origin-binding domain, recently solved at 0.295 nm
resolution, reveals an overall fold of the winged helix subfamily
of helix-turn-helix DNA-binding proteins, forming
homodimers with pseudo 2-fold symmetry (107). In addition, a
cluster of mutations that make P4 replication insensitive to
the negative regulation exerted by the Cnr protein
(see below) maps to the 3' end of the α gene, suggesting
that the gpα-Cnr interaction domain is also located in the
C-terminal part of the protein (110). Interestingly, all
mutations that make gpα insensitive to Cnr protein map to
the dimerization interface (107).

The oriI Replicon

The P4 origin of replication, oriI, is bipartite, composed of
the ori1 and crr sites (figure 26-4B, C). Both ori1 and crr
contain several direct and inverted repeats of a decameric
sequence, the type I iterons (GGTGAACAGA/T), which are
bound by the α protein (36, 63, 103, 108). Moreover, both
sites contain AT-rich stretches.

In the P4 genome, the ori1 and crr sites lie about 4500 bp
apart. The spacing between crr and ori1 can be reduced
to less than 100 bp without affecting replication (36).
However, the relative orientation of the two sites is essential
(20). Moreover, ori1 cannot substitute for crr (103), indicating
that they have different roles in replication.

The minimal ori1 site was limited to a 123 bp region,
which contains six type I iterons regularly spaced with
helical periodicity and a central 35 bp AT-rich region (103)
(figure 26-4C). Both the number and the spacing of the
type I iterons are essential for replication (103). Thus
initiation of P4 replication might occur in a similar way to the
other iteron-containing origins, such as E. coli oriC or the
phage P1 oriR (11, 12, 106), in which binding of the initiator
protein causes DNA bending and wrapping around a core of
the initiation proteins. In P4, several α proteins bound to the
ori1 site in a regular arrangement might constitute
a nucleoprotein complex competent for replication. The
central AT-rich region might be involved in specific
unwinding and primer synthesis.
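
A minimal sketch of how type I iterons could be located in a sequence and their spacing compared with the DNA helical repeat is given below. It assumes the consensus GGTGAACAGA/T quoted above, a helical repeat of about 10.5 bp per turn (a general B-DNA figure, not a value from this chapter), and a hypothetical toy sequence rather than the real ori1 region.

```python
import re

# Consensus decamer from the text; the last base may be A or T.
ITERON = re.compile(r"GGTGAACAG[AT]")
# Reverse complement of the consensus, for iterons in the opposite orientation.
ITERON_RC = re.compile(r"[AT]CTGTTCACC")
HELICAL_REPEAT_BP = 10.5  # assumption: approximate B-DNA helical repeat


def find_iterons(seq: str):
    """Return sorted (start, strand) pairs for every consensus match."""
    seq = seq.upper()
    hits = [(m.start(), "+") for m in ITERON.finditer(seq)]
    hits += [(m.start(), "-") for m in ITERON_RC.finditer(seq)]
    return sorted(hits)


def spacing_deviation(hits):
    """Distance (bp) of each start-to-start spacing from a whole number of turns."""
    starts = [pos for pos, _ in hits]
    deviations = []
    for a, b in zip(starts, starts[1:]):
        spacing = b - a
        turns = round(spacing / HELICAL_REPEAT_BP)
        deviations.append(abs(spacing - turns * HELICAL_REPEAT_BP))
    return deviations


# Hypothetical toy sequence: three iterons whose starts are 21 bp apart.
toy = "GGTGAACAGA" + "C" * 11 + "GGTGAACAGT" + "C" * 11 + "TCTGTTCACC"
hits = find_iterons(toy)
print(hits)
print(spacing_deviation(hits))
```

Small deviations indicate iterons lying on the same face of the helix; exact matching ignores degenerate iterons, which a real analysis would need to score.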

The crr site is formed by two 120 bp AT-rich repeats, each
containing five type I iterons (figure 26-4C). However, a
single 120 bp crr repeat is sufficient to promote replication,
although the efficiency is reduced about 10-fold (36, 103).
Type I iterons in crr do not show helical periodicity. The α
protein binds to these iterons, but initiation of replication
from crr has never been observed, either in vivo or in vitro
(33, 62). crr is therefore not a replication start point but
is essential for replication initiation at ori1.

In the presence of the α protein, looping of P4 DNA
molecules containing ori1 and crr was observed (33, 108,
109). This suggests that α proteins, bound to ori1 and crr,
might interact with each other. Interaction between α-ori1
and α-crr might be required for the formation of an active
replication initiation complex, possibly by the rearrangement
of the α-ori1 complex to make it competent for initiation
of bidirectional replication.

Control of P4 Replication

Many plasmids encode feedback systems that negatively
control DNA replication. P4 also controls replication when
it propagates in the plasmid state. This control is achieved
(i) by modulating the expression of the α gene at the
transcriptional level (see below) and (ii) by the negative control
on P4 DNA replication exerted by the product of the P4 cnr
(copy number regulation) gene (101). The cnr gene is located
immediately upstream of the α gene within the same
operon and the expression of the two genes is coregulated
(figure 26-3A) (31, 79). Using a two-hybrid system in yeast
it was found that Cnr and gpα interact with each other
(104). Overproduction of the Cnr protein inhibits P4 DNA
replication, whereas lack of Cnr causes overreplication and
host cell killing (101). Overall, it appears that a balanced
expression of the two proteins is necessary for proper P4
DNA replication (101, 104, 110). Although cell death does not
impair the P4 lytic cycle, a cnr-defective P4 cannot be
maintained as a plasmid. The cnr gene is therefore the only
P4 gene essential exclusively for plasmid propagation.

P4 mutants insensitive to negative Cnr control (α-cr
mutants) have been isolated. All such mutants carry
amino acid substitutions in the C-terminal region of gpα
(figure 26-4A) (110) that map to the dimerization interface
of the DNA-binding domain (107). No interaction between
α-cr mutant proteins and the Cnr protein could be
detected in the two-hybrid system (104). These observations
indicate that the α protein is the target of Cnr-mediated
negative regulation and suggest that Cnr may act by
interfering with dimerization of the gpα origin-binding domain.

It was shown in vitro that the Cnr protein does not
bind DNA but stimulates the binding affinity of wild-type
α protein to ori1 and crr. The α-cr mutant proteins are still
able to bind specifically to ori1 and crr, but in the presence
of Cnr the binding of α-cr protein to ori1 and crr is less
stimulated than the binding of wild-type α protein. No
apparent effect of Cnr on α primase and helicase activities
was found (110). In addition, Cnr protein interferes with
α-α interaction (104, 107), suggesting that the Cnr-α-DNA
complex is not competent for replication initiation unless
Cnr is released from the complex. The negative regulation
might interfere with some essential steps, for example α-α
interaction and/or ori1 melting. A model for the control of
P4 replication initiation by Cnr is presented in figure 26-5.

Two Replicons Coexist in P4

Deletion of ori1 and the cnr gene from a minimal oriI
replicon revealed the presence of an alternative replicon, oriII
(102). Replication of oriII depends upon α protein, requires
two cis-acting sites, crr and ori2, and, unlike oriI, is
completely inhibited by Cnr protein. The ori2 site has been
mapped within a 22 bp region, internal to the α-gene coding
sequence, approximately at the boundary between the
primase and helicase domains (103) (I. Oliva and D. Ghisotti,
unpublished data) (figure 26-4C). This observation suggests
that the multifunctional α gene might originate from the
fusion of two ancestral genes, coding for primase and
helicase-DNA binding functions, respectively, separated by the
origin of replication.
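
The requirements of the two replicons described above can be summarized as a small predicate. This is a deliberately simplified sketch: the function and argument names are hypothetical, Cnr is treated as a simple on/off switch for oriII, and the graded, dose-dependent effect of Cnr on oriI is ignored.

```python
def active_replicons(sites: set, alpha_present: bool, cnr_present: bool) -> set:
    """Return which P4 replicons could fire, under the simplified rules below.

    oriI  : needs the alpha protein plus the ori1 and crr sites in cis.
    oriII : needs the alpha protein plus the ori2 and crr sites in cis,
            and is treated as fully blocked whenever Cnr is present.
    """
    active = set()
    if alpha_present and {"ori1", "crr"} <= sites:
        active.add("oriI")
    if alpha_present and not cnr_present and {"ori2", "crr"} <= sites:
        active.add("oriII")
    return active


# Wild-type arrangement: all three sites, alpha and Cnr both expressed.
print(active_replicons({"ori1", "ori2", "crr"}, True, True))
# Mini-replicon lacking ori1 and cnr.
print(active_replicons({"ori2", "crr"}, True, False))
```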

The essential elements composing the two mini-replicons
are diagrammed in figure 26-4B. Both α helicase and
primase functions are required for replication from oriII
(103). The ori2 site does not contain type I iterons and does
not bind α protein (103). Moreover, no putative binding sites
for other known P4 or E. coli factors, such as a DnaA-box
Figure 26-5 A model for P4 replication. Protein α binds to both ori1 and crr sites in oligomeric form and may cause looping of
P4 DNA by protein-protein interactions. Protein α is thought to make RNA primers (arrowheads) both at the origin of
replication (priming) and at the replicative forks for lagging-strand priming (elongation). The Cnr protein, interacting with α,
prevents replication initiation by inhibiting either DNA looping or origin denaturation and priming.


consensus sequence, are found. Thus, ori2 differs
structurally and functionally from ori1. This suggests that the oriI
and oriII replicons may replicate by different mechanisms.

The initiation site of replication in the oriII origin is
still unidentified. In experiments in vitro aimed at
mapping the replication start point in the P4 genome, a
replication initiation signal was identified immediately to the
right of ori2 (63). This signal might represent the replication
initiation point of the oriII replicon.

Several prokaryotic chromosomes (e.g., plasmid R6K
and phage T7) contain multiple origins of DNA replication
(25, 53, 100). Usually the major origins are used more than
90% of the time, while minor origins are either used
infrequently or remain silent. In wild-type P4, cnr,
coexpressed with α, inhibits oriII, suggesting that
this origin is normally silenced (102). Whether the oriII
replicon is a relic of a P4 ancestor or whether it is still
functional under some physiological conditions remains to
be ascertained.


The Regulatory Network 

The different modes of P4 propagation require the
differential expression of replication and morphogenetic
functions. This is achieved mainly through complex regulation
of the α operon, which encodes both immunity and
replication functions in the 5' and 3' regions, respectively, and
the “late” sid operon. Three regulatory conditions have
been described (see figure 26-3):

(i) The uncommitted phase, which immediately
follows P4 infection of E. coli, is characterized by an early
transcriptional burst of the entire α operon from the PLE
promoter, followed by the activation of a negative control
system that causes premature transcription termination
about 300 nucleotides downstream of PLE (immunity
control, see below). This regulatory mechanism is
irreversible and marks the end of the uncommitted phase. One of
two alternative states follows:

(ii) The immune-integrated (lysogenic) state, in which
P4 gene expression is limited to the PLE-proximal part of
the α operon (immunity region). Replication genes are
therefore not expressed and the P4 integrated genome is
passively replicated by the bacterial chromosome.

(iii) The lytic-plasmid state, in which the late promoters
PLL and Psid are activated by positive
regulators. In this state, transcription from PLL drives
a new mode of expression of the entire α operon.

The key aspects of the regulatory circuitry that
controls P4 gene expression in its different developmental
phases are: (i) The α operon is downstream of the strong
constitutive promoter PLE. As a consequence, expression of
the replication genes from PLE is regulated at a
post-transcription initiation level by controlling elongation and
termination. (ii) A highly specific mechanism that
efficiently terminates all transcripts from PLE is inevitably and
irreversibly activated soon after infection. Thus,
subsequent activation of replication genes, whether in the late
lytic phase, in the plasmid state, or upon induction of the P4
prophage by the helper, requires this mechanism to be
circumvented, rather than reversed. This occurs by
activating the positively regulated PLL promoter. (iii) Because PLL is
upstream of PLE, expression of the α operon from PLL
requires the termination roadblock imposed by immunity
to be bypassed. This is achieved by translational
suppression of transcription termination. (iv) Regulatory
cross-talk permits P4 to sense the presence of P2 and to adjust
accordingly both satellite and helper gene expression.
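
The logic of the three regulatory conditions can be condensed into a toy decision function. This is only a sketch under stated assumptions: the two Boolean inputs (presence of the mature CI RNA and activity of the late promoter PLL) are a simplification introduced here, and all names are hypothetical.

```python
from enum import Enum


class Phase(Enum):
    UNCOMMITTED = "uncommitted"
    IMMUNE_INTEGRATED = "immune-integrated"
    LYTIC_PLASMID = "lytic-plasmid"


def classify_phase(ci_rna_present: bool, pll_active: bool) -> Phase:
    """Map the two simplified regulatory inputs onto a developmental phase."""
    if pll_active:
        # Late promoters are on: immunity is bypassed, not reversed.
        return Phase.LYTIC_PLASMID
    if ci_rna_present:
        # Immunity established and no late transcription.
        return Phase.IMMUNE_INTEGRATED
    # Early burst from the constitutive promoter before CI RNA appears.
    return Phase.UNCOMMITTED


def replication_genes_transcribed(ci_rna_present: bool, pll_active: bool) -> bool:
    """Replication genes are reached from PLE only without CI RNA, or from PLL."""
    from_ple = not ci_rna_present
    from_pll = pll_active
    return from_ple or from_pll


for ci, pll in [(False, False), (True, False), (True, True)]:
    print(classify_phase(ci, pll).value, replication_genes_transcribed(ci, pll))
```

The irreversibility of immunity is represented only implicitly: no input turns the CI RNA off, and late expression is reached solely through PLL.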

The Uncommitted Phase and the Path to Lysogeny

In the early, uncommitted phase the α operon is
transcribed from the constitutive promoter PLE, the replication
genes are expressed at high level, and a burst of P4
replication is observed. Transcription from PLE appears to be
termination-prone and yields RNA molecules of different
lengths (see figure 26-3A, B): (i) full-length mRNA, which
covers the entire operon, (ii) transcripts that stop at t151,
a terminator just upstream of the replication genes cnr
and α, and (iii) a family of transcripts less than 0.5 kb
long which do not extend beyond two termination sites, t4
and timm, that are located within the kil gene (13, 14,
31). Thus, different portions of the α operon are expressed
differentially.

About 15 minutes after infection, transcripts
extending beyond t4-timm can no longer be detected and all
transcripts from PLE terminate either at t1, an intrinsic
terminator located about 70 bp downstream of PLE, or at
t4-timm. In this way transcription from PLE allows the
expression of the replication genes for a restricted time
after P4 infection, after which transcription is limited
to the 5' untranslated portion of the operon, which encodes
the immunity functions (immunity region, figure 26-3B). At
this point the immune mode of transcription from PLE
is irreversibly established.

Transition from the early transcription pattern, with
expression of the entire α operon, to the immune pattern is
concomitant with the appearance of a small stable RNA
79 ± 1 nucleotides long, the CI RNA, encoded by the cI gene
(figures 26-3B, 26-6A). The CI RNA is produced by processing
of transcripts that cover the immunity region. Moreover,
the presence in the cell of the CI RNA is sufficient to
efficiently cause premature transcription termination at t1
and t4-timm in an infecting phage: thus the CI RNA is the
P4 immunity factor (13, 15, 31, 37, 39, 42).

Immunity 

For most known temperate bacteriophages, immunity is
elicited by a repressor protein which prevents transcription
initiation at promoter(s) controlling the expression of lytic
functions. The P4 immunity mechanism is unique among
the known bacteriophages in several respects: (i) expression
of the P4 α operon is prevented by premature termination
of transcription starting at the constitutive promoter, PLE;
(ii) the P4 immunity factor is a short, stable RNA (CI RNA);
(iii) transcription termination is controlled via RNA-RNA
interactions between the CI RNA and two specific target
sequences on the nascent transcript; (iv) the CI immunity
factor is produced by specific processing of the same
transcript it controls.

Most recessive mutations that cause an immunity
defect map in seqB, a region internal to the cI gene.
seqB exhibits complementarity with two sequences, seqA
and seqC, located upstream and downstream of seqB,
respectively (figure 26-3B) (42, 85). The complementarity
between seqB and both seqA and seqC must be maintained
in order to achieve timely and efficient transcription
termination. Moreover, mutations in both seqA and seqC that
restore complementarity with a cI mutation in seqB also
re-establish efficient transcription termination (42, 85). As a
result, the seqA and seqC sites in the nascent transcript
appear to be the targets of the CI RNA, as both are required
for the establishment and the maintenance of prophage
immunity. Interactions between the CI RNA and target
RNAs appear to occur primarily between complementary
regions that computer analysis predicts to be
single-stranded (14, 85) (figure 26-6C).

The above model is further supported by the analysis
of the immunity system of phage ΦR73. This P4-like phage
is heteroimmune to P4 and has provided the opportunity
Figure 26-6 Predicted secondary structure of the CI RNA and possible interactions in the leader transcript of the immunity
region. A: Computer-predicted structure of the CI RNA. The minimal sequence necessary for CI RNA maturation is reported
and the 5' and 3' ends of the mature CI RNA molecule are indicated by brackets. Mutations that affect CI activity or
processing are indicated; cI478 is a silent mutation and cI483 is a suppressor of ash3 (39). From (39), with permission.
B: Predicted secondary structures of the PLE-t1 transcript. The alternative structures and their free energy are reported.
Boldface: bases complementary to CI RNA single-stranded regions; italics: bases involved in t1 stem formation. From (14),
with permission. C: Possible interactions between the CI RNA and the seqA and seqC targets in the leader region. The
complementary bases in single-stranded regions of CI RNA are connected by lines. The base changes in the P4-like
retronphage ΦR73 are boxed. The arrow indicates an insertion, Δ a deletion. Redrawn from (86), with permission.
to compare similar immunity systems with altered
specificity (86). The ΦR73 CI RNA differs from the P4 CI RNA
in six bases, all in seqB: two in the bulge and four in the
major loop. It is noteworthy that for five of these changes
there is a complementary base substitution in both seqA
and seqC, whereas the sixth, a C-to-U transition in the
major loop, is still compatible with pairing with the G in
seqA and seqC (figure 26-6C).
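
Why a C-to-U change can still pair with G follows from the G-U wobble pair permitted in RNA duplexes. The sketch below checks antiparallel pairing between two short RNA strings with and without wobble pairs; the sequences are hypothetical toys, not the actual seqA, seqB, or seqC sequences.

```python
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}
WOBBLE = {("G", "U"), ("U", "G")}


def can_pair(x: str, y: str, allow_wobble: bool = True) -> bool:
    """True if RNA bases x and y form a Watson-Crick (or optionally G-U) pair."""
    pair = (x.upper(), y.upper())
    return pair in WATSON_CRICK or (allow_wobble and pair in WOBBLE)


def duplex_ok(strand: str, target: str, allow_wobble: bool = True) -> bool:
    """True if `strand` (5'->3') pairs along its length with `target` (5'->3')."""
    if len(strand) != len(target):
        return False
    # Antiparallel pairing: first base of strand faces last base of target.
    return all(can_pair(a, b, allow_wobble) for a, b in zip(strand, reversed(target)))


# Hypothetical toy sequences (not the real P4 or phiR73 sequences).
target = "UGGCU"          # stands in for a seqA/seqC-like target
loop_original = "AGCCA"   # fully Watson-Crick complementary to target
loop_c_to_u = "AGUCA"     # C-to-U transition at the third position

print(duplex_ok(loop_original, target))
print(duplex_ok(loop_c_to_u, target))
print(duplex_ok(loop_c_to_u, target, allow_wobble=False))
```

A position-by-position check like this ignores loop geometry and stacking energies, so it indicates only whether pairing is formally possible.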

How these RNA-RNA interactions between seqB and
its targets, seqA and seqC, control transcription elongation is
not completely understood. The complementary sequences
in the immunity region may allow both intra- and
intermolecular pairing of RNA molecules. In the uncommitted
phase, transcription of the entire α operon occurs before
the appearance of CI RNA. As soon as the mature CI RNA is
produced in the cell, the efficiency of termination within the
immunity region increases, completely preventing
transcription of the downstream replication genes (13, 14, 31).
These findings suggest that intramolecular interactions
in the leader sequence allow read-through, whereas
intermolecular interactions between the CI RNA and the nascent
leader transcript cause strong termination.

seqA is located immediately upstream of t1. It has
been shown that termination at t1 is enhanced by the CI
RNA both in vivo and in vitro: CI-dependent termination
at t1 requires a wild-type seqA sequence, strongly
suggesting that the pairing between seqA and the CI RNA may
induce termination at t1. It has been proposed that this
interaction might prevent the formation of an antitermination
structure on the nascent transcript, favoring t1 folding
(figure 26-6B). As t1 is located upstream of cI, the CI RNA
can autoregulate its own expression by reducing
transcription of the cI gene (14).

seqC overlaps the ribosome binding site and the start
codon of kil, the first translated gene in the PLE transcript.
When the nascent transcript is translated, termination at
t4 and timm does not occur. This suggests that seqB-seqC
pairing could indirectly control termination at these sites by
regulating kil translation. In particular, lack of translation may
allow the Rho factor access to timm, thus causing
transcription termination by polar effect (38). t1 and timm are not
essential for the maintenance of immunity, whereas t4
appears to be the main terminator necessary to prevent
expression of the lytic genes in the lysogenic condition.
The molecular mechanism of termination at t4, which does
not show any feature of either a Rho-dependent or an intrinsic
terminator, remains to be clarified (14).

CI RNA maturation from the primary P LE transcript
requires RNase P and polynucleotide phosphorylase
(PNPase). RNase P, an endonucleolytic ribozyme also
involved in maturation of tRNAs, generates the 5' end
of the mature molecule (37). The 3' end maturation is
promoted by PNPase, and seems to be facilitated by
polyadenylpolymerase I (PAP I). Interestingly, mutants in
PNPase and PAP I, which are involved in 3' end processing,
are also defective in 5' end maturation of CI RNA,
suggesting that 3' end processing is required to generate
the substrate for RNase P. RNase E also participates in the
maturation of CI RNA, although its role is not essential (15,
76).

It is interesting to note that a similar regulatory
module, based on transcription termination controlled by a
small pseudo-antisense RNA, is structurally and
functionally conserved in other systems unrelated to P4, such
as bacteriophages N15 and P1, which also share with P4
the capability to be maintained as autonomously
replicating plasmids. In all these systems the genes
immediately downstream of and controlled by the CI-like RNAs
are functionally related: the first gene is an inhibitor of
cell division (kil in phages P4 and N15, icd in phage P1)
and the second is an anti-repressor protein (in P4 the
ε gene encodes the anti-repressor of the P2 helper phage,
whereas P1 and N15 express an anti-self-immunity system)
(23, 40, 82, 84). Functional and structural similarity,
however, is not accompanied by sequence similarity, which is
limited to the CI-like RNA. We think that these operons
derive from an ancestral “anti-immunity module” that
was acquired through horizontal gene transfer and
evolved independently under different selective pressures, and
that the regulatory RNA, which has been conserved to a
greater extent, was subject to more stringent structural
and functional constraints.

The control of gene expression found in these
“anti-immunity modules,” based on a “pseudo-antisense”
RNA that controls transcription termination, appears to
be very effective in obtaining a transient expression of
potentially lethal genes upon infection of a naive cell.
By sequence-structure similarity searches in sequence
databases, potential CI-like RNAs have been found in
a Shigella flexneri prophage-related sequence, in IncI1
plasmids, and in the Acinetobacter chromosome (38, 82),
suggesting a widespread diffusion of this mechanism
of genetic control.

Integration 

In addition to establishing the immune condition that
prevents expression of the replication genes, lysogenization
by P4 requires the integration of the P4 genome into the 
bacterial chromosome. Integration occurs according to the 
Campbell model by site-specific recombination between 
the P4 and the bacterial att sites (see chapter 7): the latter 
corresponds to the 3' end of a gene ( leuX) encoding a 
tRNA Leu isoacceptor. Recombination occurs within a 20 
nucleotide long, GC-rich core region identical in both 
phage and host att sites, and this preserves the integrity of 
the leuX gene sequence upon prophage integration (78). 

Integration requires the integrase, coded by the P4 int 
gene (17a, 77) (figure 26-3A). This gene, with coding 
capacity for a 426 amino acid protein, is located to the left 
of attP and is transcribed leftwards (78) (D. Piazzolla and
D. Ghisotti, unpublished data). P4 integrase belongs to
the highly divergent family of site-specific recombinases,
which includes the well-characterized λ integrase (3). The
C-terminal half of the protein, which is directly involved in
the recombination event, is particularly well conserved
in all integrases of the λ family, whereas the N-terminal
region that specifically binds the arm sequences of the att
sites is more divergent. In the P4 att site a pair of 16 bp
direct repeats, present on either side of the core sequence,
and an inverted repeat in the left arm are thought to be
bound by Int (78). A consensus sequence for IHF is present
in the right arm of attP, suggesting that this bacterial protein 
is part of the P4 integration complex. 

The int promoter shows canonical σ70 consensus
sequences and is active early after P4 infection. At later
stages and in the lysogenic state, P int activity is low (17b);
it has been suggested that Int autoregulates its own
expression by binding to the attP left arm sequences that overlap
the P int region (78).

P4 integrase is necessary for both P4 integration and 
excision (77, 78). This latter event occurs spontaneously 
at low frequency and, at much higher efficiency, upon 
infection of a P4 lysogen by the helper phage P2 (87). 
Excision requires a second P4 protein, Vis, a small (88 
amino acids) basic protein that presents a helix-turn-helix 
motif and binds the P4 attP region (17b, 79). Vis is expressed 
from P LL , the P4 late promoter that is trans-activated by
phage P2 (see below). Therefore, the first response of a P4 
prophage to P2 infection is to promote excision and the 
lytic cycle. 

Like most excisionase proteins, Vis bends DNA upon
binding (17b), suggesting that Vis actively participates
in the formation of the protein-DNA complex involved
in excision. Moreover, Vis is a regulatory protein that
controls transcription from P LL , P sid (see below), and
probably P int , thus being involved in the control of int expression
(79) (D. Piazzolla and D. Ghisotti, unpublished data).

Integration within known or putative tRNA genes 
appears to be a widespread phenomenon among both 
prokaryotic and eukaryotic mobile genetic elements such 
as viruses, plasmids, and transposons (19, 83); this may 
reflect the structural and/or mechanistic features offered 
by tRNA genes that may be exploited for evolution of 
integration systems. 

P4-like cryptic prophages, or simply P4 integrase
homologs integrated into a tRNA gene and associated with
clusters containing accessory bacterial functions, have
been found repeatedly in Gram-negative species. The
functions associated with the above elements include
retrons, modification-restriction systems, pathogenicity
and symbiosis islands, and catabolic pathways (5, 9, 16, 18,
24, 49, 60, 61, 78, 81, 97, 99). These findings testify to a wide
diffusion of P4-like genetic elements and their direct
involvement in bacterial evolution.


The Late-Plasmid Transcription Pattern 

In P4, the choice between lytic-plasmid and lysogenic
development relies on the presence or absence of activation
of the two late promoters, P LL and P sid . The lytic-plasmid
and the immune modes of transcription are not mutually
exclusive; rather, the former mode is superimposed on
the immunity control. Although the patterns of gene
expression in the plasmid and lytic cycles are essentially
superimposable, the mechanisms leading to the activation as
well as the outcome of these two conditions are markedly
different.

The Multicopy Plasmid State 

Maintenance of P4 in the plasmid state requires that two
main conditions are satisfied: (i) the immune regulation
that prevents expression of the plasmid replication
functions must be bypassed, and (ii) P4 plasmid replication
must be controlled in order to maintain a number of
P4 genomes compatible with survival of the bacterial
host. In the plasmid state P LL drives the expression of
the α operon and thus its regulation is crucial for the
establishment and maintenance of the plasmid condition.
Homeostatic control of P LL seems to be achieved by the
opposing actions of gpδ, expressed from P sid , and Vis,
encoded by the first gene transcribed from P LL itself.
These two proteins activate and repress P LL , respectively
(29, 30, 79).

P LL is located 400 nucleotides upstream of P LE (30). 
When transcription starts at P LL two additional genes, 
vis and eta, are expressed. Consequently, the transcription 
termination barrier imposed by the immunity system 
when transcription initiates at P LE rather than P LL is 
bypassed via translational suppression of transcription 
termination (38) (figure 26-3). The start codon of vis is 
located about 50 nucleotides downstream of P LL ; the 
start codon of eta is partially overlapped by the vis stop 
codon, and the two genes appear to be translationally 
coupled (38, 79). eta extends through the immunity region, 
overlapping kil in frame (Kil is therefore a truncated form 
of Eta). Thus, transcripts starting at P LL , unlike
transcripts starting at P LE , do not have an untranslated
leader region. Nonsense mutations in vis or eta cause
premature termination of transcription about 700
nucleotides downstream of P LE , suggesting that ribosome
protection of the RNA transcribed from P LL may prevent
interaction between the transcript and the CI RNA and/or
the Rho factor, thus impeding transcription
termination (38). Therefore it appears that immunity is the
default program inescapably expressed by an infecting
P4 phage. To enter the alternative life-style (plasmid
state or lytic cycle), P4 simply bypasses the immunity
mechanism by activating termination-insensitive
transcription from an upstream promoter, P LL .
The termination insensitivity necessary to enter into
the plasmid state does not extend to t 151 , which is found
downstream of the promoter P LE (see figure 26-3). As
a result, the expression of the α gene, which is found
downstream of t 151 , may still be inhibited. Differential
expression of replication and morphogenetic functions,
as controlled by the α protein, can consequently occur
even while P4 is in the plasmid state (30, 38).

Upon infection by wild-type P4, the multicopy plasmid
state may be established at a low frequency, the lysogenic
cycle being preferred in the absence of a helper phage
genome. Mutations affecting P4 immunity may increase
the frequency of plasmid establishment. However, the
inability to establish immunity does not necessarily direct P4 to
plasmid propagation, since uncontrolled expression of the
kil gene from P LE is lethal to the host (38). As a consequence,
P4 cI and P4 seqA/seqC mutants, in addition to being unable
to lysogenize, kill the majority of the infected cells and enter
the plasmid state at a low frequency, comparable to that
of the wild-type phage (2, 85). In contrast, the P4 cI kil
double mutant establishes the plasmid state in about 100%
of infected cells (2). This indicates that
establishment of plasmid propagation requires the shut-off of kil
expression.

Transcription from P LL is essential for expression of
plasmid replication functions in the plasmid condition. In
fact, P4 vis nonsense mutants, in which transcription
from P LL terminates prematurely, and P4 δ mutants, in
which P LL cannot be activated, are impeded in plasmid
propagation (29, 30, 38), whereas activation of
transcription from P LL by promoter-up mutations, which make
transcription independent of the positive regulators,
increases the frequency of P4 plasmid establishment
(28, 30, 85).

The expression level of the plasmid replication genes
from P LL is lower than from P LE . This appears to depend on
negative regulation of P LL transcription by the Vis protein:
expression of a cloned vis gene causes a dramatic inhibition
of transcription from P LL , inhibits P4 propagation in
the plasmid state, and suppresses the virulent phenotype of
P4 vir1, a promoter-up mutation that makes P LL
independent of its activators (66, 79). The indirect control of the P4
plasmid replication rate due to negative regulation of P LL
transcription is not sufficient for P4 maintenance in the
plasmid state. A more direct control of P4 plasmid
replication by the Cnr protein (see above) is necessary for
copy number control of plasmid P4.

A low basal level of transcription from P LL may be
detected soon after infection, due to an overlapping weak
promoter P LL * (29). However, no basal transcription has
been detected from P sid . This poses the unsolved problem of
how P sid transcription (and, as a consequence, the
plasmid regulatory state) is primed upon infection, when gpδ
is not yet present in the cell. It is possible that the right
operon may be transcribed at a very low basal level in a
δ-independent manner. Alternatively, either the P4 Vis
protein or other unidentified P4 functions might be
involved.

A possible question is whether the P4 plasmid
condition is simply an abortive lytic cycle of no biological
significance. In the laboratory, in the absence of the helper, the
lysogenic condition is largely preferred and is more stable,
thanks to the efficient integration and immunity
mechanisms, and maintaining the wild-type P4 as a plasmid
requires the continuous selection of plasmid carrier clones,
which grow more slowly than the lysogens. It therefore
seems likely that in nature, too, lysogeny is the preferred
condition and the plasmid state might be transiently
established upon P4 infection of a host lacking a helper, or upon
abortive infection of a P4 lysogen by a helper phage.

The Lytic Cycle 

Satellite-Helper Regulatory Interactions 

The P4 lytic cycle requires efficient exploitation of the
helper phage genetic information to obtain the
morphogenetic and lysis gene products. This exploitation may occur
under different scenarios: (i) P4 infecting a repressed P2
lysogen; (ii) P4 and P2 co-infecting a non-lysogenic host;
(iii) a P4 lysogen being infected by phage P2; (iv) a
P4-plasmid carrier being infected by phage P2. In each of these
situations, the satellite phage senses the presence of the
helper and responds by activating its own functions that
in turn will modify the pattern of gene expression of the
helper. Reciprocal regulatory interactions developed in this
satellite-helper system involve lifting the immunity
mechanisms (mutual derepression) and direct activation of
the late operons (reciprocal trans-activation) of both phages
(see figure 26-7).

P2 prophage derepression by phage P4 occurs via direct
inactivation of the helper repressor by the P4 ε gene product.
It appears that the dimeric gpε protein interacts with the
dimeric C repressor of phage P2 to form gpε-C multimeric
complexes that interfere with the binding of P2 repressor
to its operator (40, 70, 71, 93). Conversely, P2 may
derepress the P4 satellite prophage by activating the P LL late
promoter through the helper early gene product Cox, thus
bypassing the P4 immunity system. The Cox binding region is
located at -60 to -150 in P LL (28, 87, 93).

The P4 late promoters P LL and P sid , and the P2 late
promoters of the morphopoietic operons, may be
reciprocally trans-activated by the P4-encoded gpδ and the
P2-encoded Ogr transcriptional activators (26, 30, 46). Ogr is
the prototype of a class of transcriptional activators that
control the late operons of phages related to either P2 or P4
(46). These activators are small proteins (72-81 amino acids)
with a zinc-finger-like metal-binding domain. Like most
members of this family, phage φR73 gpδ is composed of
a single module (52, 56). Interestingly, P4 gpδ is twice
the size of the other members of the family; it contains two
zinc-binding motifs and appears to be a tandem duplication
of the basic Ogr module (48). Both domains of gpδ are
required for transcriptional activity (57).

Figure 26-7 Regulatory loops controlling expression of P4 and P2 late genes. Straight arrows indicate transcription units.
Transcriptional or functional activation and inhibition are indicated by bent lines ending with an arrowhead or with a minus
sign, respectively. The P2 map is not drawn to scale. Redrawn from (69), with permission.

The Ogr-family activators interact with the C-terminal
domain of the RNA polymerase α subunit and bind to a
consensus sequence centered at about -55 nucleotides
from the transcription start point in all the responsive
σ70-dependent promoters (45, 56). The P4 P sid promoter,
however, contains a second copy of the consensus sequence
in the -18 region that appears to quench transcription
activation by δ, as mutations in the P sid -18 region increase
δ-dependent promoter activity (83). This seems to explain
why δ-promoted transcription is less efficient from P sid
than from P2 late promoters and might be instrumental in
regulating δ expression in the plasmid condition.

In addition to these regulatory interactions,
morphogenetic control is exerted by P4 in order to assemble a
capsid of the correct size (see below). The type and timing of
the satellite-helper crosstalk vary under the different
infection conditions and give different outputs, in terms of the
satellite versus helper virions produced, that appear to fit
the reproductive strategies of the satellite phage. Details
associated with these different infection conditions are
considered individually as follows:

(i) Infection of a P2 lysogen by P4. When P4 infects
a repressed P2 lysogen and the lytic option is set, the
helper prophage immunity is lifted by P4 gpε. Phage P2
derepression leads to expression of the helper early genes
(cox and the replication genes A and B), and to activation
of unidirectional P2 DNA replication in situ, without
excision of the integrated prophage genome (40, 93). P2 late
gene expression (morphogenesis and lysis) may therefore
ensue via the normal P2 activation mechanism, which
requires P2 DNA replication and the Ogr transcriptional
activator. Expression of P2 morphogenesis and lysis genes is
both necessary and sufficient for the completion of the P4
lytic cycle (40, 46, 95). P4, however, may efficiently
trans-activate P2 late gene expression, bypassing the need
for P2 replication and the positive regulator, Ogr.
Trans-activation can occur because P4 gpδ activates P2
transcription from the same promoters as those used by
the P2 phage Ogr transcriptional activator (21, 22, 46, 95).

The P4 ε gene is essential for P4 growth in a repressed
P2 lysogen, but not in co-infection with P2 of a
non-lysogenic cell (32). Derepression appears to be necessary for
the timely activation of the helper’s morphogenetic operons
by the P2 Ogr and/or the P4 gpδ proteins. Either
transcriptional activator is sufficient to trans-activate both P4
and P2 late genes, although in the presence of functional
Ogr and gpδ proteins the P4 lytic cycle occurs more
efficiently than with a single activator (95).

The low P2 excision frequency and the interference 
with P2 growth due to the efficient production of small 
capsids (7, 32, 89) together appear to be responsible for
the production of mostly P4 rather than P2 virions. 
Thus, in an environment of P2 lysogenic bacteria, where 
P2 replication would be inhibited by the P2 prophage 
immunity, all the morphogenetic potential is efficiently 
directed toward P4 propagation. 

(ii) P4 and P2 co-infecting a non-lysogenic host. When 
P4 and P2 co-infect a non-lysogenic host, the burst size 
of P2 is reduced and progeny of both phages is produced 
in about the same proportions. Interference with P2 
growth is stronger if P4 infects the host earlier than P2. 
The gpε derepression activity is dispensable for the P4
lytic cycle, although it appears to be required to interfere
with the helper’s growth (32).

(iii) Infection of P4 lysogen by P2. Infection of a P4 
lysogenic strain by phage P2 virions induces the P4 
lytic cycle. However, the yield of the P4 phage is low 
(ca. 1 per cell) and interference on P2 growth is not 
observed. The P4 genome can therefore be rescued 
under such conditions, albeit at low efficiency, and can 
survive phage P2-mediated host cell death. The P4
prophage senses the infection of the helper via the P2 early
protein, Cox, which activates P LL and thus the expression
of the α operon. Cox does not activate P sid , which lacks
the Cox binding site. At a later stage after infection,
P sid may be efficiently activated by the P2 Ogr protein
(87, 93).

(iv) Infection of a P4-plasmid carrier by P2. When P2 
infects a cell carrying P4 in the plasmid condition, the 
Cox function is not required to induce the P4 lytic cycle
since the P4 late promoters are already activated (2, 28). 
In such conditions P4 strongly interferes with P2 
growth. 

P4 Morphogenesis 

The final step in satellite-helper interactions is at the level
of viral particle assembly. Phage P2 packages its 33 kb
genome into 60 nm isometric icosahedral capsids
consisting of 60 hexameric and 12 pentameric capsomers (T = 7).
The capsomers are made of the N-terminal processed
products (gpN*) of the P2 N gene. The capsid maturation
pathway involves (i) assembly of a procapsid made of
gpN, the internal scaffolding protein gpO, and the tail
connector at one vertex; (ii) proteolytic processing of the
procapsid proteins accompanied by the structural
reorganization of the procapsid shell lattice; (iii) packaging of the
33 kb long P2 DNA molecule. Pre-assembled tails are then
added to the connector vertex (see chapter 25). P4
subverts the P2 morphogenetic pathway by directing the
assembly of a 45 nm isometric capsid (consisting of 30
hexamers and 12 pentamers, T = 4) more suitable for its
own genome, which is a third of the phage P2 genome
size. A schematic model for P4 and P2 capsid assembly is
shown in figure 26-1C.
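
The capsomer counts quoted above for the two capsid sizes follow from the triangulation number. As a minimal sketch, assuming an idealized Caspar-Klug icosahedral shell (12 pentamers at the vertices, 10(T - 1) hexamers, 60T subunits in total) and ignoring the replacement of one pentamer by the connector, the arithmetic can be checked as follows. The function name is illustrative only.

```python
def capsomer_counts(t_number: int) -> dict:
    """Capsomer and subunit counts for an idealized icosahedral shell.

    Assumes the Caspar-Klug relations: 12 pentamers, 10 * (T - 1)
    hexamers, and 60 * T subunits. The connector vertex is ignored.
    """
    pentamers = 12
    hexamers = 10 * (t_number - 1)
    subunits = 5 * pentamers + 6 * hexamers  # equals 60 * T
    return {"pentamers": pentamers, "hexamers": hexamers, "subunits": subunits}


for label, t in (("P2-size (T = 7)", 7), ("P4-size (T = 4)", 4)):
    print(label, capsomer_counts(t))
```

For T = 7 this gives the 60 hexamers of the P2-size shell and for T = 4 the 30 hexamers of the P4-size shell, in agreement with the figures in the text.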


Assembly of Small Heads 

The choice of the T = 4 morphogenetic pathway is
dictated by the presence of the P4 sid (size
determination) gene product. P4 sid mutants are unable to form
small capsids. They package dimers and trimers of P4
DNA into P2-size heads (89). Mutations in P2 that make
the head assembly process resistant to the effect of Sid
map in the P2 N gene (sir mutants: sid responsiveness) (94).
The P2 Sir phenotype may be suppressed by second-site
mutations in P4 sid (supersid mutations) (59). This genetic
analysis suggests that P4 Sid protein interacts with P2
gpN and alters the “normal” P2-directed head assembly
geometry. Coexpression in E. coli of only the phage P2 N and P4
sid genes leads to the production of small
procapsid-like particles (35, 73, 74). In three-dimensional
reconstructions, Sid forms a continuous dodecahedral scaffold on
the outside of the particles. This external scaffold
connects at the 3-fold axes and makes contacts with the
underlying shell on the gpN hexamers (74) (figure 26-1B).
Similar structures may be obtained in vitro from purified
gpN and Sid proteins and in vivo upon coexpression of
both genes (35, 105). It is plausible that Sid acts on the
subassembly of the hexameric capsomers which, in turn,
determine the final geometry of hexamer and pentamer
assembly. It should be noted, however, that in vivo the
Sid external scaffold does not substitute for the
P2-encoded gpO internal scaffolding protein, which is
essential for the production of both P2 and P4 virions (65, 92).

Maturation of the procapsid involves the
interdependent proteolysis of the P2 gpN shell and gpO internal
scaffold proteins, and the release of the Sid external
scaffold (65, 72, 73). In addition, the Psu capsid “decoration”
protein associates on the top of the hexameric
capsomers and stabilizes the phage particle. Psu is the only
P4-encoded protein that may be found in the mature
P4 virion, but it is not essential for P4 morphogenesis
(54, 55).

DNA Packaging 

P4 dependence on the helper for the morphogenetic
pathway includes DNA maturation and packaging.
Circular DNA molecules, the product of both P2 and P4
replication pathways, are the preferred substrate for
packaging by both P2 and P4 proheads. A staggered
double-strand cut that generates the 19 nucleotide long
single-stranded cohesive ends of the mature genome occurs
within a 55 nucleotide long sequence, essential for DNA
packaging, that is identical in P2 and P4 (10, 80, 112)
(see chapter 25 and references therein).

From the above description it is apparent that P4
has developed sophisticated regulatory strategies to
exploit P2 and maximize the chances for its own
horizontal transmission in bacterial populations that may
differ with regard to the presence and the state of the
helper phage. Clearly, P4 appears as an episome that can
propagate horizontally by a specialized system of
“generalized transduction.” We speculate that P4 evolved from
an ancestral plasmid replicon by independent
acquisition of an “integration module” (att, int) for its stable
maintenance in the host chromosome, and a “transduction
module” (cos and, probably later on, sid) for packaging
in the transducing helper phage. The interaction with
the helper phage has been the driving force in the
evolution of the lytic-plasmid regulatory circuitry that adapted
to the helper-phage late regulatory mechanisms.
Bacteriophage λ and its Genetic
Bacteriophage λ (lambda) was discovered at the dawn of
molecular biology as an inadvertent byproduct of
studies of the genetics of Escherichia coli K-12, where λ
resides as a prophage (221). Lambda has never been
independently isolated from nature a second time, but because of
its particularly fortuitous choice of time and place to
make its appearance, it became one of the very small
number of experimental systems that were used in
elucidating our most basic understanding of biological organisms at
the molecular level. It has been said for many years, and
may still be true, that more scientist-years per base pair
have been devoted to understanding the biology of phage
λ than is the case for any other organism, and this is due
only in part to the modest size of its genome (48,503 bp).

Research on phage λ has made major contributions
to studies of regulation of transcription, regulatory circuitry,
mechanisms of recombination of both the homologous and
site-specific varieties, mechanisms of DNA replication,
genetic transduction, cell lysis, virion structure and
assembly mechanisms, viral evolution, and DNA sequencing and
cloning technologies, to name just a few highlights. Each of
these topics has been the subject of one or more major
reviews, and there have been two books devoted exclusively
to λ biology (142, 156). It is therefore not our aim to
reproduce here all of the accumulated wisdom from more
than 50 years of phage λ research.

Instead, we will give a necessarily abbreviated
overview of λ's life cycle and genetic functions. This will be
followed by a more detailed consideration of a topic that has
received increasing attention recently and one for which
our present level of understanding is made possible only by
the availability of genomic sequence data for a number of
related phages. That topic is the genetic and evolutionary
relationships between λ and a group of similar phages,
sometimes referred to as the lambdoid phages and most
recently comprehensively reviewed by Campbell (43). These
phages have for the most part not been studied in the
same detail as phage λ, but the data that are available
for them (DNA sequence as well as biochemical and genetic
data) provide a new dimension to the accumulated data
about λ itself. It becomes possible in this way to see λ not
simply as a laboratory phenomenon but as a member of
a natural population of phages, interacting and evolving
in the natural environment.

References were chosen in this chapter to allow entry 
into the literature and may not always credit those 
who made discoveries. The relative emphasis given to
different topics unavoidably follows the authors’ expertise and
interests. 

A λ Overview

Biological Overview 

Virions 

The heads of phage λ virions (figure 27-1) are
icosahedrally symmetric, isometric, and about 60 nm in diameter.
The non-contractile tails are 150 nm long, and they appear
slightly flexible based on how they lie on an electron
microscope sample grid. The main body of the tail (the “shaft” or
“tube”) consists of stacked protein disks which join to the
base of the conical “tail tip.” This morphology (isometric
head, long non-contractile tail) is the most common
among the double-stranded DNA phages that have been
isolated and characterized (see chapter 2 for an overview of
phage classification). Phage λ has two types of tail fiber:
one short fiber extending from the center of the tail tip
and four long, jointed “side tail fibers,” attached at about
the junction between the tail shaft and the tail tip.

Figure 27-1 The bacteriophage λ virion. Electron
micrograph of λ, negatively stained with uranyl acetate.
Scale: the length of the tail, excluding fibers, is approximately
150 nm. Micrograph courtesy of Robert Duda.

The double-stranded DNA chromosome, tightly packed
into the head without bound proteins, is nominally 48,503
bp long (the 48,502 bp of the sequenced laboratory version
of λ (314), corrected for the 1 bp deletion in the stf gene
(see below) that was introduced in early laboratory
manipulations of the phage (149)). The ends of the linear
DNA molecule in the virion have 12 nucleotide
single-stranded 5'-extensions; these are referred to as “cohesive
ends” because of the base-pairing complementarity of the
extensions at the two ends of the chromosome (155, 402).
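
The base-pairing complementarity that makes the ends cohesive can be expressed as a simple check: two 5' extensions, each written 5' to 3', can anneal when one is the reverse complement of the other. The sketch below is illustrative only. The 12-nucleotide sequence is an arbitrary placeholder, not the actual cohesive-end sequence, and the function names are hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def ends_are_cohesive(left_ext: str, right_ext: str) -> bool:
    """True if two 5' single-stranded extensions can base-pair in full."""
    return right_ext.upper() == reverse_complement(left_ext)


# Placeholder 12-nucleotide extension (not a real phage sequence).
left = "ACGTTGCAAGGC"
right = reverse_complement(left)
print(left, right, ends_are_cohesive(left, right))
```

Annealing of two such ends is what allows a linear molecule carrying them to circularize after it enters the cell.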

Closer examination of virions shows that the heads
have two main protein components: gpE (“gene product of
gene E”) and gpD, arranged in T = 7 laevo icosahedral
symmetry, and present in probably exactly 405 copies
each (63, 88). GpE, also known as coat protein, makes up
the main structure of the shell, and the smaller gpD subunits
are clustered as trimers bound to the exterior at the 3-fold
and quasi-3-fold sites in the gpE lattice, where they
strengthen the capsid. Quantitatively “minor” components
of the head include the portal (also called the “head-tail
connector”), a 12 gpB-subunit ring structure that occupies
a position at only one of the 5-fold symmetric “corners” of
the shell (the corner to which tails will join) (210, 353),
about six copies each of gpW and gpFII, which provide
a structural transition between the portal and the tail (46,
267), and about five copies each of two similar proteins
(“X1” and “X2”) that are constructed during capsid
maturation and consist of a covalent fusion between a portion
of the major capsid subunit, gpE, and a portion of the putative
protease, gpC (147).

The virion’s tail shaft consists of 32 stacked hexameric
rings of the major tail subunit, gpV, corresponding to the
morphologically defined disks (38, 40, 63, 195). The lumen
of the tail tube is largely filled with about six copies of
a proteolytically processed form of the tail-length tape
measure protein, gpH (145, 354), though the right end of
the DNA has descended a short distance into the
head-proximal end of the tail (66, 350). The tail tip contains a few
molecules each of the products of genes J, L, and M (and
perhaps I and K) (189, 195). Of these, gpJ is present in
about three copies in the mature virion and constitutes
the central tail fiber and a substantial portion of the
conical tip. The head-proximal end of the tail has a few copies
of gpU, probably a hexameric ring on top of the stack
of gpV rings that make up the shaft (197, 199). The last
gene in the tail assembly pathway, Z, is required not
for joining tails to heads but for joining them
productively (56); it is not clear whether gpZ is present in mature
virions.

Lytic and Lysogenic Life-styles. 

As a temperate phage, λ has two alternative life-styles or 
“growth cycles” available to it. In the lytic cycle, the phage 
infects the cell by inserting its DNA into the cytoplasm. 
There it is circularized. An orderly expression of phage 
genes ensues, with the result that the phage uses the 
energy of the host’s metabolism and its biosynthetic machinery 
to produce 50-100 progeny virions. The expression of 
certain genes eventually causes cell lysis and the release of 
progeny phages to infect new cells and repeat the cycle. 
This process is described in more detail below. 

The lysogenic cycle starts, like the lytic cycle, by infection, 
but the majority of the phage genes become repressed soon 
after infection by the action of a phage-encoded repressor, 
the product of the cI gene. (Note to the λ novice: The “I” in 
the gene name “cI” is a Roman numeral one (182). Thus 
the name of the gene, as well as the name of the repressor 
protein, is pronounced “see-one.” Pronouncing it as “see-eye” 
is an error, and doing so in public may elicit ribald 
laughter and ridicule from those who know better.) During 
establishment of the repressed state, the λ chromosome 
becomes integrated into the continuity of the bacterial chromosome. 
Once this occurs, the phage DNA, now known as 
a “prophage,” is replicated passively and distributed to 
daughter cells as a part of the host chromosome. 

A cell carrying a λ prophage is said to be a “λ lysogen” 
or to be “lysogenic for λ.” This association can persist for 
an indefinite number of generations with no apparent 
harm to the host (95, 227), because all the lytic genes of the 
prophage are repressed. This life-style can be considered a 
“cycle” because, after an indeterminate period as a quiescent 
prophage, the prophage DNA can become “induced” to 
lytic growth mode and make progeny phage. Induction 
results from loss of repression by the CI repressor (see 
below). Discussion of lysogeny in general, and regulation of 
the lambda life-cycle in particular, can also be found in 
chapters 7-9. 

Repressor and Lysogenic Conversion 

The cytoplasm of a lysogenic cell has a concentration of 
CI repressor that is sufficient to keep the resident prophage’s 
lytic genes repressed; as a result, if another λ virion injects 
its DNA into the cell, it will immediately also be repressed 
and therefore be unable either to enter the lytic cycle or 
to make the enzyme (integrase) that would allow it to 
become a part of the host chromosome and ensure its replication 
in the lysogenic cycle. A λ lysogen is thus said to 
be immune to infection by λ. Prophage immunity is specific 
to the phage since it is based on the specific binding between 
the repressor and its cognate operators. Thus, phage 434, 
a close relative of phage λ, is unaffected by the presence 
of a λ prophage in a cell it infects because of the different 
specificity of binding between its repressor and operators, 
and conversely λ can infect a 434 lysogen. 
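
The specificity rule described above can be written as a very small lookup. This is only an illustrative sketch: the immunity labels and function names are hypothetical, chosen for the example, and the model ignores everything except the match between a resident repressor and the operators of the incoming phage.

```python
# Toy model of prophage immunity: a superinfecting phage is shut off only
# when the resident prophage's repressor recognizes its operators.
# The labels "imm_lambda" and "imm_434" are hypothetical names for the
# two repressor/operator specificities mentioned in the text.
IMMUNITY = {"lambda": "imm_lambda", "434": "imm_434"}


def is_immune(resident_prophage: str, infecting_phage: str) -> bool:
    """True if the resident repressor binds the incoming phage's operators."""
    return IMMUNITY[resident_prophage] == IMMUNITY[infecting_phage]


def can_infect(infecting_phage: str, resident_prophage: str) -> bool:
    """A phage can grow on a lysogen unless the resident repressor blocks it."""
    return not is_immune(resident_prophage, infecting_phage)
```

Under these assumptions, can_infect("lambda", "lambda") is False, while can_infect("434", "lambda") and can_infect("lambda", "434") are both True, matching the reciprocal behavior described in the text.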

The change in the phenotype of a lysogenic cell that 
results from the expression of the repressor by the prophage 
(that is, immunity to infection by a phage with the 
same repressor specificity) is a specific example of the 
more general phenomenon of lysogenic conversion. Five 
other genes in λ are known to be expressed from an otherwise 
repressed prophage and as a result to change the 
phenotypic properties of the host cell; in other words, they 
cause lysogenic conversion. These include the rexA and 
rexB genes (263, 351), whose products block successful 
infection by several phages including rII mutants of phage 
T4 (and which thereby played a central role in some of the 
classic early experiments of molecular biology; 24, 25), the 
sieB gene (283), whose product blocks infection by a different 
group of phages, and two genes, lom and bor, which 
are thought to make the bacterial host a more effective 
pathogen of mammals. In the case of lom this increase in 
pathogenicity occurs by helping the lysogen bind to mammalian 
cells and in the case of bor this occurs by making 
the lysogen more resistant to killing by serum (17, 358). 

Chromosome 

Figure 27-2 is a physical map of the λ genome, showing 
the genes as boxes and the various DNA sites that regulate 
transcription and other aspects of the phage’s life-cycles. The 
arrows indicate the directions and locations of the major 
transcripts. The orientation of the map, with head and 
tail genes on the left and the early genes (i.e., those expressed 
early in the lytic cycle) on the right, is the conventional 
representation. The region from the left end to attP, containing 
the head and tail genes, is called the “left arm” and 
the region from attP to the right end, containing the early 
genes, is called the “right arm.” The genes are for the most 
part organized into large operons, grouped by function 
and by when their expression is needed. This organization 
facilitates control of gene expression by regulating a 
small number of promoters. 

Figure 27-2 The bacteriophage λ chromosome. Features of the bacteriophage λ chromosome are shown in a diagrammatic 
and somewhat simplified form. Genes are represented by rectangles (white, transcribed rightward; gray, transcribed 
leftward; vertical positions are varied for graphical clarity). Selected gene names are given on or above these rectangles. 
Asterisks (*) denote genes that are expressed in the lysogen and diamonds (♦) denote regulatory genes. A scale in kilobase 
pairs is shown below the genes, and below this the major characterized transcripts are indicated by black arrows. Important 
promoters are indicated by small arrows, and other important DNA sites are marked by vertical lines, just above the kilobase 
pair scale. The slanted arrow between the G and T reading frames indicates a programmed translational frameshift (see text). 
The region names b2, Ea, PPEL (promoter proximal early left), nin, PPL (promoter proximal late), and ERE (extreme right end) 
are discussed in the text. 

The ends of this map correspond to the ends of the 
DNA molecule in the virion. When the phage injects its 
DNA into the cell upon infection, the DNA ends join to 
make a circle. The cohesive ends are regenerated toward the 
end of the lytic cycle as DNA is packaged into new virions. 
On the other hand, when the phage enters the lysogenic 
cycle the circle is broken at a different site (the attachment 
site, or attP) during the integration of the DNA into the host 
chromosome (see figure 7-1 in chapter 7 for a cartoon 
of λ insertion). The result is that the prophage sequence 
is a circular permutation of the sequence shown in 
figure 27-2, with the two ends of the prophage corresponding 
to the sequences flanking the two sides of attP. 
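
The circular permutation can be illustrated with a short sketch. The region labels below are a deliberately coarse stand-in for the real map (they are assumptions for the example, not a gene list), and the function name is hypothetical; the orientation of the prophage within the host chromosome is not modeled.

```python
# Coarse, illustrative version of the virion map: left arm, attP, right arm.
VIRION_MAP = ["head genes", "tail genes", "attP", "early genes"]


def prophage_order(virion_map, att="attP"):
    """Join the ends of the linear map into a circle, then reopen it at att."""
    i = virion_map.index(att)
    left_arm = virion_map[:i]        # left end of the virion DNA up to attP
    right_arm = virion_map[i + 1:]   # attP to the right end of the virion DNA
    # Joining the cohesive ends puts the right arm directly before the left
    # arm; cutting within attP leaves part of the site at each prophage end.
    return [f"{att} (half)"] + right_arm + left_arm + [f"{att} (half)"]


print(prophage_order(VIRION_MAP))
# ['attP (half)', 'early genes', 'head genes', 'tail genes', 'attP (half)']
```

The point of the sketch is only that the same regions appear in the same cyclic order, with the joined virion ends now lying in the interior of the prophage.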

Overview of Molecular Mechanisms 

Lytic Cycle 

The lytic cycle can start either with prophage induction 
(described below) or with infection. The initial interaction 
between virion and host cell is carried out by the side 
tail fibers, which interact with an abundant component of 
the cell surface, probably the outer membrane porin OmpC 
(149). Subsequently the central (gpJ) tail fiber binds to 
the LamB protein, also an outer membrane protein, which is 
normally involved in maltose transport (and whose synthesis 
is subject to catabolite repression; 404). 

LamB binding triggers a presumed conformational 
change in the tail and release of the DNA from the virion 
into the cell. The DNA is preceded out of the virion by the 
tail-length tape measure protein, gpH, which occupies 
much of the lumen of the tail. Where in the cell the tape 
measure protein ends up, however, is not clear (307, 308). 
The details of how the DNA transits the two membranes 
and periplasm are also not understood, but the host PtsM 
protein, an inner membrane protein and part of the mannose 
import system, is involved (108, 260, 317, 383). Immediately 
after arrival in the cytoplasm the cohesive ends of 
the λ chromosome anneal and are joined together by the 
host DNA ligase (120). This yields a covalently continuous 
double-stranded circular molecule. 

All λ transcription is done by the host RNA polymerase. 
When the λ DNA first enters the cell, the polymerase initiates 
transcription from only two promoters with relevance 
to the lytic cycle: PL and PR (figure 27-2). (Promoters for the 
lysogenic conversion genes are presumably also recognized 
but without any known consequences for the lytic cycle.) 
The polymerase encounters a terminator and stops after 
transcribing a single gene in both cases: gene N from PL 
and gene cro from PR. 

Nothing more would happen save for the action of 
gpN, which interacts with subsequent RNA polymerases 
initiating at PL and PR and renders them insensitive to 
termination signals. The N protein is, in other words, a transcription 
antiterminator (116, 129, 301, 319, 376). This is 
accomplished through an interaction among the RNA polymerase, 
gpN, four host proteins (NusA, NusB, NusE, and 
NusG), and a specific sequence in the nascent PL and PR 
mRNAs called the nut (N utilization) site. This mechanism 
requires that the nut site be part of the transcript being 
synthesized by the polymerase, and consequently it only 
works for promoters with an associated nut site (only 
PL and PR in λ). Because of the cis-acting nature of nut sites, 
some RNA polymerases are “antiterminated” (insensitive to 
terminators) at the same time that other RNA polymerases 
that started at different promoters are not. This difference 
is in fact the basis for an important part of the regulation 
of prophage integration and excision (described below). 

The most important effect of gpN antitermination is 
that transcription initiated at PL and PR can be extended 
across the remainder of the early genes. From PL, transcription 
extends past N through several additional genes 
(see figure 27-2). These genes include cIII, which has a role 
in the decision between lytic and lysogenic growth, and 
three genes, exo, bet, and gam, which promote homologous 
recombination during lytic growth (274, 332, 335). The 
products of the latter three genes have been studied extensively 
and have been shown to be an exonuclease 
(212, 229), a strand annealing protein (190, 225), and an 
inhibitor of the host RecBC nuclease (238, 275), respectively. 

From PR the first gene to be transcribed after cro is cII, 
which plays a central role in the lytic/lysogenic decision 
(395). The cII gene is followed by genes O and P, whose 
products are essential for initiating phage DNA replication 
(118). The next 10 genes are not required for phage growth, 
although in some cases biochemical functions have been 
defined for them. 

The last gene in the early transcript from PR is gene Q. 
Gene Q protein is responsible for turning on high level 
expression of the late genes, which it does by allowing 
RNA polymerase at the late promoter PR' to escape from 
a strong pause site (306). Like gpN, gpQ interacts with a 
special site near the promoter to alter the RNA polymerase, 
but it differs from gpN in that gpQ interacts with the site 
in the DNA and not with the corresponding site in the 
nascent RNA transcript (303). The 
overall functional effect is the same in that RNA polymerase 
that initiates at PR' and interacts with gpQ is insensitive to 
termination signals and continues to be insensitive through 
the entire 26.3 kbp of the late gene operon. 

The “early period” of λ lytic growth ends as expression 
of the late genes begins, 10-12 minutes after infection. Of 
the many activities encoded by the early genes, the only 
ones that have essential roles in the lytic cycle in the 
laboratory are those encoded by N, cro, O, P, and Q. The Cro 
protein is a repressor which, like CI repressor, binds to the 
operators OL and OR, each of which has three subsites; 
each subsite binds a single repressor dimer. The OL 
and OR operators control PL and PR, respectively (134, 281). 
However, the “rules” for how the two repressors bind to 
the operators (affinity, cooperativity, order of binding to 
subsites) are different, and the two repressors consequently 
have different quantitative effects on expression from PL 
and PR. The precise biological role of these operator subsite 
differences remains unclear (231, 300). The Cro protein 
builds up during the early period of lytic growth and 
toward the end of that period causes a decrease, though 
not a complete shut-off, of transcription from PL and PR. 

Replication of phage DNA begins soon after the beginning 
of expression from PR, initiated by the action of gpO 
and gpP. The O protein binds to the replication origin, 
which is a tandem series of four 19 bp near-exact repeats 
(called “iterons”) lying within the gene O coding region 
(101, 355). The bound O protein recruits the P protein 
to the origin, which in turn recruits the host DnaB 
helicase (236). This complex is then partially disassembled 
by the host DnaK, DnaJ, and GrpE chaperones to release 
DnaB from the origin (these ubiquitous chaperones were 
discovered and named because of their participation in 
λ DNA replication; 403). The remainder of the host replication 
apparatus is then added and replication proceeds 
away from the origin (87, 118, 245, 270, 408, 409). The λ 
O protein is metabolically unstable, so that when PR 
transcription stops during establishment of lysogeny, 
initiation of DNA replication is also rapidly extinguished 
(127, 410). 

During lytic growth, or prior to the establishment of 
lysogeny, replication initially proceeds bidirectionally from 
the origin to produce daughter circles from the initial circular 
template chromosome. At 10-15 minutes into the 
lytic growth cycle, replication changes to the rolling circle 
mechanism, which produces the multi-genome linear concatemers 
that are the substrate for packaging into the head. 
The mechanism of this switch to rolling circle replication 
remains unclear (15). 

The end of the early period and the beginning of the 
late period is marked by the dramatic increase in transcription 
of the late gene operon from PR', and therefore by the 
beginning of gpQ activity. The well-studied λ Q gene lies at 
the end of the early rightward transcript and its product, 
the Q protein, causes high levels of late gene transcription 
by facilitating release of RNA polymerase from a pause 
site just downstream of the late promoter, PR'. In addition, 
gpQ renders the RNA polymerase insensitive to downstream 
cis termination signals by contacting its σ70 subunit (237, 
259, 303, 306). Essentially all transcription of the late genes 
starts at PR' and continues through the entire late gene 
operon. Because it takes RNA polymerase more than 
10 minutes to traverse the operon, the time when synthesis 
of the various late proteins begins depends on their position 
in the operon (294). As far as is known, these differences in 
onset of translation have no functional significance. The late 
operon genes encode the proteins involved in building 
progeny virions and in cell lysis. 
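
As a rough check on the figures quoted above, the operon length (26.3 kbp) and the traversal time (more than 10 minutes) together imply an upper bound on the average elongation rate. The sketch below assumes a constant rate along the operon, which is a simplification introduced here, and its variable and function names are hypothetical.

```python
OPERON_LENGTH_BP = 26_300   # late operon length quoted in the text (26.3 kbp)
MIN_TRAVERSAL_S = 10 * 60   # "more than 10 minutes", so this is a lower bound

# Upper bound on the average elongation rate implied by those two figures.
max_rate_nt_per_s = OPERON_LENGTH_BP / MIN_TRAVERSAL_S  # about 43.8 nt/s


def min_onset_delay_s(distance_from_promoter_bp: float) -> float:
    """Earliest time after initiation at the late promoter that a polymerase
    moving at the bounding rate could reach the given position."""
    return distance_from_promoter_bp / max_rate_nt_per_s
```

With these assumptions, a gene halfway along the operon could not begin to be transcribed until at least about five minutes after one near the promoter, which is the position dependence the paragraph describes.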

Virion Assembly: Overview 

Assembly of the structural components of the virion 
depends on ordered interactions among the structural 
proteins and not on their order of synthesis. The amount of 
messenger RNA synthesized is essentially the same for all of 
the late operon genes, and the stability of that messenger 
RNA is nearly the same whether measured by a physical 
or functional assay (294, 295). Despite this, the molar yield 
of protein from the different genes varies over a nearly 
1000-fold range. The translational efficiency for a given 
gene is determined by the region surrounding the beginning 
of the gene, presumably the translation initiation 
signals (311). The translational yields for the different 
genes appear to correlate not with the genes’ positions in 
the operon but with the amount of each protein that is 
needed for its biological function, in most cases virion 
assembly (57). When the various proteins required for 
virion assembly and concatemeric DNA accumulate to 
suitable levels, assembly of progeny virions occurs via a 
specific assembly pathway, a set of ordered protein-protein 
and protein-DNA interactions (122, 195). Like other 
tailed phages, λ assembles heads and tails independently, 
and then these join to form functional virions (371, 372). 

Head Assembly 

Like most other large viruses, λ assembles its head by 
first assembling a procapsid (an empty coat protein shell) 
and then actively pumping the DNA chromosome into that 
shell (184, 185) (see chapter 6 for a general overview of the 
packaging of dsDNA in phage heads). λ has 10 genes that 
are required for head assembly, and the products of six of 
them end up in the mature virion (table 27-1). Only a subset 
of the head genes, B, C, Nu3 and E, are needed for procapsid 
assembly. Twelve subunits of gpB make up the grommet-like 
portal structure. The hole in the center of this dodecameric 
ring is the “portal” through which DNA will enter the 
procapsid. The GroEL/S host chaperonin system is needed to 
produce functional portals, probably at the level of subunit 
folding (indeed, it was the study of this process in λ that led 
to the initial discovery of this chaperone, and gave it its 
name; 9, 123). 

Once portals are made, they may act as a nucleus for 
procapsid assembly, although this has not been rigorously 
shown, and some procapsid-like particles assemble in the 
absence of gpB (reviewed by 122, 251). The details of the 
procapsid assembly pathway are not clearly understood, 
but aspects of some roles of individual proteins are known. 
The gene C protein has a particularly complex and interesting 
role. It is tentatively identified as the head maturation 
protease because of its sequence similarity to the ClpP 
protease family and because of the protein composition of 
procapsids and other phage-related aberrant structures 
that are assembled when gpC is defective. Gene Nu3 encodes 
the putative scaffolding protein; scaffolding proteins aid 
the proper assembly of coat proteins by co-assembling 
with them in the interior of the procapsid, but they are 
removed from the structure before DNA is packaged. 

Table 27-1 Virion Structure and Assembly Genes of Bacteriophage λ 

Gene   Size of protein   In virion?   Copies   Function, special features
       (amino acids)
Nu1    181               N                     Small terminase subunit: DNA packaging
A      641               N                     Large terminase subunit: DNA packaging
W      68                Y            ~6       Adaptor between portal and gpFII
B      533               Y(a)         12       Portal
C      439               Y(b)         ~10      Protease
Nu3    131               N                     Scaffolding protein
D      110               Y            405      Major capsid decoration protein
E      341               Y            405      Major capsid subunit
FI     132               N                     Accessory role in DNA packaging
FII    117               Y            ~6       Forms tail attachment site on head
Z      192               ?                     Head-tail assembly
U      131               Y            ~6       Tail shaft stabilization
V      246               Y            192      Major tail subunit
G      140               N                     Tail assembly chaperone
T      279(c)            N                     Extension, by translational frameshift, of gpG tail assembly chaperone
H      853(d)            Y            ~6       Tail length tape measure protein
M      109               Y?                    Tail tip assembly
L      232               Y?                    Tail tip assembly
K      199               N?                    Tail tip assembly
I      223               N?                    Tail tip assembly
J      1132              Y            ~3       Tail tip assembly, central tail fiber
stf    774               Y            12       Side tail fiber, main structural component
tfa    194               Y            12       Side tail fiber, assembly factor and structural component

(a) Twenty-one amino acids cleaved from N-terminus of most subunits. 
(b) Processed into X1 and X2. 
(c) Size given is for G-T frameshift product. 
(d) Approximately 100 amino acids removed during tail maturation. 


The Nu3 gene is nested in-frame in the last one third 
of the C gene; this means that the C-terminal one third of 
the gpC sequence is identical to the gpNu3 sequence (323). 
The functional implications of this sequence relationship 
are not well understood, but because multiple scaffolding 
proteins typically occupy the coat shell interior by 
themselves, it is tempting to speculate that the C-terminal 
portion of gpC co-assembles with the gpNu3 molecules that 
make up the scaffold. (We could see a gpC-gpNu3 interaction 
either as a way of assuring that gpC gets included 
into the assembly or as a mechanism by which gpC, 
already assembled to the portal, could nucleate assembly 
of the scaffold.) 

Regardless of how gpC comes to be part of the structure, 
once the procapsid is assembled, each of the approximately 
10 copies of gpC becomes covalently bonded to one 
of the 415 copies of gpE that initially form the procapsid 
shell, and the gpC-gpE “fusion product” gets trimmed 
proteolytically to make two slightly different sized products 
named X1 and X2 (147). (Numerology: It would take 420 
copies of gpE to make a complete T = 7 shell. In the actual 
prohead five of these are missing to make space for the 
portal, giving 415 gpEs. Of these, approximately 10 get 
converted to X1 or X2 through reaction with gpC, leaving 
405 gpEs in the mature prohead and head.) The locations 
of X1 and X2 in the procapsid structure are not known, but 
a plausible model suggests that they are located around 
the portal where they act as structural adaptors between 
the regular gpE lattice (that makes up the procapsid 
shell) and the portal (which interrupts that regular lattice; 
252). Other proteolytic events that are part of procapsid 
maturation include removal of 21 amino acids from the 
N-terminus of the portal protein, gpB (though curiously 
only from about two-thirds of the subunits) (144, 367), and 
fragmentation of the scaffolding protein, gpNu3, which 
is then lost from the structure (144, 168, 293). 
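
The gpE bookkeeping in the parenthetical above can be laid out step by step. The sketch uses the standard relation that a complete icosahedral shell of triangulation number T contains 60T subunits; the count of 10 gpC fusions is the approximate figure given in the text, and the function and parameter names are hypothetical.

```python
def gpe_counts(t_number=7, portal_vertex_subunits=5, gpc_fusions=10):
    """Return (complete shell, procapsid shell, mature head) gpE copy numbers."""
    full_shell = 60 * t_number                         # complete T = 7 shell
    procapsid = full_shell - portal_vertex_subunits    # five removed at the portal vertex
    mature = procapsid - gpc_fusions                   # ~10 converted to X1/X2
    return full_shell, procapsid, mature


print(gpe_counts())  # (420, 415, 405)
```

The final figure agrees with the 405 copies of gpE listed in table 27-1; because gpD binds as trimers and is also present in 405 copies, the same count corresponds to 135 gpD trimers.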

Following assembly and proteolytic maturation, the 
procapsids are ready to package a phage chromosome. In 
addition to the procapsids, packaging requires the replicated 
DNA in the form of a multigenome concatemer and 
three head proteins that are not part of the mature virion, 
the products of genes Nu1, A, and FI. The gpNu1 and gpA 
polypeptides are the subunits of a multitalented enzyme 
known as terminase (so named because it creates the 
termini of the virion chromosome). Terminase recognizes 
and binds to the cos site on the concatemeric DNA, makes a 
staggered cut of the DNA to produce the left cohesive end 
of the DNA that is about to be packaged, and then carries 
the end of the DNA to the portal vertex of the procapsid, 
where the terminase docks with the portal to form the 
DNA packaging pump. This pump is driven by ATP hydrolysis 
that is probably carried out at an active site on the 
gpA subunit (65, 94). Following packaging of a full complement 
of DNA, the terminase recognizes and cuts at the 
downstream cos site and probably departs from the head 
bound to the DNA downstream from the cut, which is the 
beginning of the next chromosome that will be packaged. 

The role of gpFI in this process is an auxiliary one in 
that phage mutants missing a functional FI gene still 
produce infectious virions, albeit with about 1% of wild-type 
efficiency, and FI-independent mutants are easily 
isolated (254). Genetic experiments suggest that gpFI interacts 
with the gpA subunit of terminase and with the major 
capsid subunit, gpE (64, 253). A plausible model is that gpFI 
is part of the DNA-terminase complex and facilitates docking 
of that complex with the portal of a procapsid by mediating 
the initial contact between the complex and the 
procapsid. 

During DNA packaging, and in some way triggered by 
the packaging process, the procapsid shell undergoes 
a dramatic rearrangement of the gpE subunits. This rearrangement 
produces a shell that is larger in diameter, has 
walls that are thinner in cross-section, is smoother surfaced, 
and is more angular in overall appearance (hexagonal 
in outline). This rearrangement exposes new sites at 
the points of 3-fold symmetry in the gpE lattice, and these 
are binding sites for the second major capsid component, 
gpD, which joins the capsid as trimers and significantly 
strengthens the structure (88, 122, 171, 340). Gene D protein 
binding stabilizes the capsid substantially and in fact is 
necessary to make the capsid strong enough to successfully 
contain a full genome’s worth of DNA (340). Once packaging 
is complete, the final two head proteins, gpW and 
gpFII, add to the head in that order in about six copies 
each (46, 58, 267). GpFII forms the binding site for the tail, 
which will join spontaneously. 

Tail Assembly 

Tail assembly in λ begins at the tip of the tail with gpJ. 
The products of genes I, K, L, and M then act in that order 
to finish the tail tip (189, 195, 198). In a parallel reaction, 
the products of genes H, G, and G-T form a complex which 
then interacts with the tail tip and the major tail subunit, 
gpV, to produce a tail that is complete except for the addition 
of two proteins at the head proximal end (193, 194). These 
last two proteins are gpU, which adds about six subunits 
to cap and stabilize the tail shaft (199), and gpZ, whose 
mode of action is unknown but in the absence of which 
tails still join to heads, although the resulting particles are 
defective in DNA injection (56). 

The interaction of the products of genes H, G, and 
G-T, mentioned above, is a particularly interesting and 
central part of tail assembly. Gene H protein is an α-helical 
protein which is cleaved in an assembly-dependent fashion 
by an unidentified protease (146, 354) and which acts as 
a tape measure to determine the length of the tail shaft 
(192, 196). G and T are overlapping open reading frames 
that are expressed through a translational frameshift 
mechanism, similar to the expression of gag and pol in 
many retroviruses. This arrangement results in the production 
of a large amount of gpG and a small amount of gpG-T, 
the frameshift product that includes an N-terminal half 
with the sequence of gpG and a C-terminal half with the 
sequence encoded by the “T” reading frame (224). 

GpG and gpG-T act as assembly chaperones by coating 
newly synthesized tape measure protein, gpH, and holding 
it in an extended conformation. The C-terminal “T” domain 
of gpG-T binds soluble gpV, the major tail shaft subunit, 
and most likely serves both to recruit gpV to the site of 
assembly and to induce it to change into its assembly-competent 
conformation. Thus activated, gpV is presumed 
to assemble around the tape measure protein to form the 
tail shaft, displacing and replacing the gpG and gpG-T 
chaperones as it does so (399). 

It is evident from this description how the tape measure 
protein may determine tail length, namely by binding 
a standard (probably saturating) amount of the gpG and 
gpG-T assembly chaperones. Thus, if the major tail subunit 
gpV only assembles by interacting with and replacing 
the chaperones bound to the tape measure protein, the 
length of the tail will be determined by the length of 
the tape measure protein. The lengths of the tails of phages 
λ, HK97, and HK022, as well as those of a number of 
other long-tailed phages, are roughly proportional to the 
sizes of their tape measure proteins (264). 

Cell Lysis 

Progeny virions are assembled soon after late proteins 
appear, and cell lysis happens abruptly some 30-35 minutes 
after translation of the lysis genes commences. Phage λ 
has three genes and five proteins that are responsible for 
cell lysis at the end of the lytic cycle. These genes are situated 
at the beginning of the late gene operon, and two of them, as 
described below, are “double” genes in the sense that they 
each encode two distinct protein products. 

The most straightforward of the lysis functions, the 
cell wall hydrolase or endolysin, is encoded by the middle 
gene of the group, gene R in λ. The λ R protein is a 
transglycosylase that hydrolyzes a particular bond in the 
cell’s peptidoglycan, specifically a 1,4-β linkage between 
N-acetyl-D-glucosamine and N-acetyl muramic acid. This 
digestion weakens the cell wall sufficiently to allow the 
osmotic pressure difference across the cell wall and other 
physical stresses to burst the cell (28, 105). 

Lysis timing is determined largely by interactions 
between the two alternative products of the S gene, holin 
and anti-holin (368, 406). Holins disrupt the inner 
membrane of the cell and, because the cell wall hydrolase is 
separated from its cell wall substrate by the inner 
membrane, the timing of holin action determines when the 
cell wall can be degraded and therefore the timing of cell 
lysis. The S gene encodes two translational starts that result 
in production of two membrane proteins that differ in length 
by only two amino acids but have opposite biological effects. 
The shorter protein causes efficient membrane disruption 
and the longer one inhibits the action of the first. It is 
the interaction of these two proteins as they accumulate 
throughout the lytic cycle that in a complex and not well 
understood way determines the timing of cell lysis (31, 132, 
368, 406). For more detail on S-mediated λ lysis, and phage 
lysis in general, see chapter 10. 

The third lysis gene, Rz in λ, encodes an accessory 
lysis function in that it is required for lysis or plaque formation 
only in the presence of high concentrations of divalent 
cations. It has been suggested that the Rz protein hydrolyzes 
a peptide bond in the peptidoglycan structure, but 
there is in fact no direct evidence for this claim. Whatever 
its function, it also requires a second protein, called Rz1, 
that is encoded entirely within the Rz gene in a different 
reading frame but in the same orientation (348). Mutations 
have been constructed in λ that selectively knock out one 
or the other of the Rz and Rz1 open reading frames, and 
both mutations have the same defective phenotype, failure 
to lyse in the presence of high levels of divalent cations (407). 

Lysogenic Cycle 

The lysogenic cycle starts, like the lytic cycle, by infection 
of a cell by a λ virion, and the first few events of the two 
cycles are the same: injection and circularization of the 
DNA, expression of the two immediate early genes, N and 
cro, and action of the N protein to allow expression of 
the rest of the early genes. Following these initial events 
a “decision” is made between lytic growth, as described 
above, and lysogenic growth, in which the eventual outcome 
is a viable cell with a repressed λ prophage incorporated 
into the host chromosome. The two primary known determinants 
of which way the decision will go are the multiplicity 
of infection (that is, how many individual phages infect 
the cell simultaneously) and the physiological state of 
the cell. The molecular basis of the decision has been the 
subject of intense and prolonged study over the past 
50 years (96, 241, 281); we give here a somewhat abbreviated 
and simplified picture. 

The proximal effector of this decision is the concentration
of CII protein in the cell (157, 231, 395). CII protein is a
phage-encoded transcription factor that binds to TTGCN6TTGC
DNA sequences (two TTGC repeats separated by six base
pairs) in the -35 regions of three phage λ promoters and
converts those promoters from sites that are not recognized
by the polymerase into promoters that sponsor a high level
of transcription (160, 164, 324, 345, 395). The primary
effect of CII protein is mediated through the P_RE promoter,
which, in the presence of high levels of CII protein,
efficiently produces a transcript of the cI repressor gene.
The resulting CI repressor protein effectively shuts down
the lytic cycle by binding to its operators, O_L and O_R,
and preventing transcription from P_L and P_R.
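
As a rough illustration of what the TTGC-N6-TTGC recognition
pattern means in practice, the following Python sketch scans
one strand of a DNA string for two TTGC repeats separated by
any six bases. The function name and the demo sequence are
invented for this example (the sequence is not a real λ
promoter), and a real search would also need to consider the
complementary strand and the position relative to the -35
region.

```python
import re

# Two TTGC repeats separated by any six bases (TTGC-N6-TTGC).
# The lookahead lets overlapping candidate sites be reported too.
CII_SITE = re.compile(r"(?=(TTGC[ACGT]{6}TTGC))")


def find_cii_sites(seq):
    """Return (0-based start, matched site) for each match on the given strand."""
    seq = seq.upper()
    return [(m.start(), m.group(1)) for m in CII_SITE.finditer(seq)]


# Hypothetical test sequence, made up for illustration only.
demo = "AACGTTGCATCGATTTGCGGA"
print(find_cii_sites(demo))
```

The spacer is matched with a fixed width of six, so a site
with a five- or seven-base spacer is deliberately not reported.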

The second promoter activated by CII protein is P_I, which
produces a transcript encoding the integrase (1). Thus when
the phage DNA is heading for a repressed state because of
high levels of CII protein, enough integrase is made to
ensure that the repressed prophage will be successfully
incorporated into the host chromosome. Also, since the
xis gene is not expressed from the P_I transcript and Xis is
only required for excision (see below), the proper direction
of the reaction is ensured, that is, toward lysogeny.

CII protein also activates the P_aQ promoter, which makes
an antisense RNA repressor of the Q gene, which delays
entry into the late period of the lytic cycle (163). Thus, high
levels of CII protein push the phage into the lysogenic
pathway. If CII protein levels are low, then the phage
defaults to the lytic cycle.

The CII protein can be regarded as the phage's environmental
sensor through which it determines whether the lytic or
lysogenic pathway will be more beneficial to the phage's
interests. The precise manner in which the host physiology
affects λ's lysis/lysogeny decision remains somewhat
uncertain, but CII is a metabolically unstable protein,
and its concentration in the cell is therefore determined by
a balance between rates of synthesis and of degradation
(68, 161). On the side of synthesis, the rate of production
is higher when P_R is highly expressed. When there are more
DNA templates in the cell (that is, at higher multiplicity
of infection) lysogeny is favored, perhaps because it
takes more repressor to shut down more copies of P_R and/or
there are more cII gene copies in the cell to make a higher
concentration of CII (e.g., 20).

Stability of the cII portion of the P_R messenger RNA is
also regulated. P_oop transcription, which is at least partly
controlled by the host LexA SOS repressor, creates an
antisense RNA to the cII portion of the P_R transcript (213).
The RNA duplex formed with the cII mRNA is inactivated
by RNase III cleavage (214). Stability of the P_oop
transcript, in turn, is affected by polyadenylation by the
host PcnB protein (346, 390). Finally, the host protein IHF
(integration host factor) has been reported to affect cII
gene translation (234), and recently reported effects of
guanosine tetraphosphate (ppGpp) and the host SeqA,
ClpP/ClpX, and DnaA proteins on lysogenization frequency of
λ and/or on transcription from P_R indicate that multiple
signals may impinge on CII synthesis and thus on λ's
lysis/lysogeny decision (81, 125, 279, 330, 331).

Proteolytic degradation of CII protein, which is also
important in determining its intracellular concentration,
is accomplished primarily by the host HflB (also known
as FtsH) membrane-bound protease system (209, 325). At
high cAMP concentrations the stability of CII protein is
enhanced and lysogeny is favored; conversely, at low
cellular cAMP concentrations, as when the cells are growing
on glucose, lytic growth is favored, presumably because
HflB activity is higher and CII protein degradation is
faster (20, 131, 180, 309, 331) (this catabolite repression
effect may be stronger in some lambdoid phages than
others; 169, 285). Finally, host proteins HflA and HflD affect
the levels of CII protein by their modulatory effects on
the activity of HflB protease, opening the possibility that
other, as yet unknown signals acting through these
proteins might affect the lysis/lysogeny decision (202, 203).
The phage-encoded CIII protein, which is made from the
P_L transcript, also has a critical role in this balance of
activities; it inhibits (and is degraded by) the activity of
HflB and so favors lysogeny (153, 211).
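
The idea that CII concentration reflects a balance between
synthesis and degradation can be made concrete with a
deliberately minimal sketch. It assumes constant synthesis
and first-order degradation, so that the steady-state level
is simply the ratio of the two rates. The threshold, the rate
values, and the function names are hypothetical and in
arbitrary units; none of them come from the literature cited
above.

```python
def cii_steady_state(synthesis_rate, degradation_rate):
    """Steady state of dC/dt = synthesis_rate - degradation_rate * C."""
    if degradation_rate <= 0:
        raise ValueError("degradation_rate must be positive")
    return synthesis_rate / degradation_rate


def predicted_pathway(cii_level, threshold):
    """Crude switch: high CII favors lysogeny, low CII favors lytic growth."""
    return "lysogenic" if cii_level >= threshold else "lytic"


# Hypothetical values, arbitrary units.
THRESHOLD = 10.0
SYNTHESIS = 4.0

# Fast degradation, e.g. high HflB activity during growth on glucose.
fast_decay = cii_steady_state(SYNTHESIS, degradation_rate=1.0)
# Slow degradation, e.g. HflB inhibited by CIII or at high cAMP.
slow_decay = cii_steady_state(SYNTHESIS, degradation_rate=0.25)

print(fast_decay, predicted_pathway(fast_decay, THRESHOLD))
print(slow_decay, predicted_pathway(slow_decay, THRESHOLD))
```

In this toy model, raising the multiplicity of infection would
correspond to increasing the synthesis rate, which moves the
steady state in the same direction as slowing degradation.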

The body of knowledge about the biochemical mechanisms
by which λ lytic/lysogenic decisions are made, as
sketched above, is extensive and varied. Yet it is clear that
the complete story has not been told. As a recent example,
CI repressor has been shown to form an octamer that links
O_R to O_L by a DNA looping mechanism. This
mechanism appears to have a crucial role in turning off
P_RM (see below) and allowing for an effective transition into
lytic growth upon prophage induction (86), and it may also
have a role in the lytic/lysogenic switch. This and other
recent findings about how this paradigmatic genetic
switch functions are still being worked into our
understanding of the process, and we can expect informative
new aspects of the workings of the switch to continue to be
revealed for some time to come.

Integration and Its Regulation 

λ's well-studied prophage integration functions map near
the center of the vegetative map. Integrase (Int), assisted by
the host factor IHF, catalyzes a site-specific reciprocal
recombination reaction between attP, the phage attachment
site located just downstream of the int gene, and attB, the
bacterial attachment site situated between two genes in the
E. coli chromosome (44, 79, 217, 375). The result is the
insertion of the prophage DNA into the continuity of the
bacterial chromosome, circularly permuted relative to the
virion DNA, and flanked by two hybrid attachment sites.
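
The circular permutation that results from integration can be
sketched with gene lists. The example below treats the virion
map as a list, closes it into a circle, and reopens it at attP
before placing it between two host genes. The gene list is
heavily abbreviated, the function name is invented, and the
flanking host genes are written as gal and bio as an
assumption of this example, since the text above says only
that attB lies between two genes.

```python
def integrate(virion_map, host_left, host_right):
    """Return gene order across an integrated prophage.

    virion_map: genes in linear virion order, containing the marker "attP".
    host_left, host_right: host genes flanking attB (illustrative names).
    """
    cut = virion_map.index("attP")
    # Circularize the virion DNA, then open the circle at attP.
    prophage = virion_map[cut + 1:] + virion_map[:cut]
    # The crossover leaves a hybrid attachment site at each end.
    return host_left + ["attL"] + prophage + ["attR"] + host_right


# Abbreviated virion gene order, for illustration only.
virion = ["A", "J", "attP", "int", "xis", "N", "cI", "cro", "Q", "R"]
lysogen = integrate(virion, ["gal"], ["bio"])
print(lysogen)
```

Note that int and xis, central in the virion map, end up at
one end of the prophage, while the head and tail genes (A
through J) end up at the other.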

Excision, which is the macroscopic if not microscopic
reversal of the integration reaction, requires the phage Xis
and host Fis proteins in addition to Int and IHF (12, 69, 103).
The direction of the reaction is controlled during different
parts of the phage life cycle by adjusting the ratio of
Int to Xis to drive the reaction in the direction appropriate
to the biological situation. This is accomplished in part
by differential stability of the two proteins (222, 374) and by
activation of the P_I promoter by CII, but primarily by an
elegant “retroregulation” system in which the phage senses
whether it is integrated or not and regulates Int and Xis
synthesis accordingly (97, 268).

The phage senses its state of integration by whether
the sib regulatory site, which lies across the attachment
site from int in the nonintegrated genome, is downstream
from the int and xis genes or, as in the integrated
prophage, is not. A key to this regulation lies in the fact
that int can be transcribed from two promoters, P_L and P_I.
Transcripts from these promoters differ in whether they
include xis (P_L does and P_I does not), in whether the
transcribing RNA polymerase has been antiterminated by the
N protein (P_L has, P_I has not), and consequently in what
kind of RNA structures are formed if the polymerase
encounters sib (P_L forms a messenger RNA destabilizing
structure, P_I forms a stable 3' end). The result of these
differences is that transcripts from the two promoters
have very different functional stabilities, but only if the
phage is not integrated (that is, only if sib is included in
the transcript). The overall effect is that expression from
an induced, integrated prophage produces both Int and
Xis, therefore allowing excision. Expression from a
nonintegrated chromosome during establishment of lysogeny
produces only Int, favoring integration.
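
The retroregulation logic described above can be summarized as
a small truth table. The sketch below is qualitative only: it
encodes the outcomes stated in the text and, as a
simplification made for this example, treats the destabilized
P_L message in a nonintegrated genome as contributing no
protein at all. The function names are hypothetical.

```python
def proteins_from(promoter, integrated):
    """Proteins effectively produced from one promoter's transcript."""
    if promoter not in ("P_L", "P_I"):
        raise ValueError(f"unknown promoter: {promoter}")
    # sib lies downstream of int and xis only when the phage is not integrated.
    sib_in_transcript = not integrated
    if promoter == "P_I":
        # P_I transcript lacks xis and is not antiterminated;
        # if it reaches sib it simply forms a stable 3' end.
        return {"Int"}
    # P_L transcript carries both xis and int and is antiterminated by N.
    if sib_in_transcript:
        # Read-through into sib forms a destabilizing structure.
        # Simplification for this sketch: no functional output.
        return set()
    return {"Int", "Xis"}


def net_output(active_promoters, integrated):
    """Union of the proteins made from all active promoters."""
    made = set()
    for promoter in active_promoters:
        made |= proteins_from(promoter, integrated)
    return made


# Induced, integrated prophage: P_L active, no CII, so P_I is silent.
print(net_output({"P_L"}, integrated=True))
# Establishment of lysogeny: nonintegrated genome, CII activates P_I.
print(net_output({"P_L", "P_I"}, integrated=False))
```

The two printed cases correspond to the two outcomes named in
the paragraph: Int plus Xis for excision, Int alone for
integration.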

Repressed State and Induction 

In the repressed prophage the only genes that are expressed
are the cI gene encoding the CI repressor and the few
other lysogenic conversion genes. Because the cII gene is
one of the genes that is firmly repressed in this condition,
there is no CII protein to activate transcription of the cI
gene from the P_RE promoter. Instead, cI is transcribed from
a different promoter, P_RM, which is located just upstream
from the cI coding region, overlapping the O_R operator.
Like P_RE, expression of P_RM requires a transcription
activator, but in this case the activator is not the CII
protein but the CI protein itself (134, 165). At the
concentrations of repressor typically found in a lysogen,
two of the three repressor-binding subsites of O_R are
occupied by CI dimers, and in this configuration P_RM is
activated. However, if the concentration of CI repressor
rises sufficiently to cause occupation of the third, lower
affinity subsite of O_R, then RNA polymerase is denied
access to P_RM and further transcription stops.
Transcription is thus inhibited until cell growth decreases
repressor concentration to the point that the third O_R
subsite again becomes free and further cI transcription can
commence.
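
The self-limiting behavior of P_RM can be illustrated with a
toy feedback loop. In the sketch below, repressor is
synthesized only while P_RM is in its activated state and is
diluted every generation by cell growth. The two thresholds,
the synthesis amount, and the dilution fraction are all
hypothetical numbers in arbitrary units, chosen only to show
the qualitative behavior.

```python
ACTIVATE_AT = 50.0  # hypothetical level at which two O_R subsites are filled
BLOCK_AT = 200.0    # hypothetical level at which the third subsite is filled


def prm_state(repressor):
    """Classify P_RM from the repressor level (arbitrary units)."""
    if repressor >= BLOCK_AT:
        return "blocked"
    if repressor >= ACTIVATE_AT:
        return "activated"
    return "unactivated"


def simulate(repressor, generations, synthesis=30.0, dilution=0.1):
    """Track the repressor level over successive cell generations."""
    history = []
    for _ in range(generations):
        if prm_state(repressor) == "activated":
            repressor += synthesis       # cI transcribed from P_RM
        repressor *= (1.0 - dilution)    # dilution by cell growth
        history.append(repressor)
    return history


history = simulate(100.0, 50)
print(round(min(history[10:]), 1), round(max(history), 1))
```

With these particular numbers the level climbs until the
third subsite is filled, drops by dilution, and then hovers
just around the blocking threshold, which is the homeostatic
behavior the paragraph describes.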

Induction of the prophage into the lytic cycle happens
when repression by the CI repressor is lost. In a
small fraction of the population (perhaps about one cell in
10^6 per cell generation under typical laboratory growth
conditions), induction occurs “spontaneously,” that is,
without obvious provocation. The mechanism(s) responsible
for spontaneous induction is not known, but it
may simply result from the stochastic loss of functional
repressor molecules to below the level needed to keep P_RM
activated.

Treatment of the culture with an appropriate dose of
ultraviolet light or other DNA-damaging agents such as
mitomycin C leads to simultaneous induction of essentially
every cell that is lysogenic for λ in the culture. Relative
to spontaneous induction, this situation is much
better understood: DNA damage-mediated induction works
through the cellular SOS response. Just as the activated
RecA protein that is produced in response to DNA damage
turns on expression of the SOS genes by causing proteolytic
autocleavage of the LexA repressor, it also causes
autocleavage of the CI repressor of the phage (230, 305). The
cleaved repressor is no longer able to dimerize and as
a result it can no longer bind effectively to the operators.
The lytic cycle thus commences.

Comparative Lambdoid Phage Genomics 

What Is a Lambdoid Phage? 

It became clear early in the study of λ that some of the
E. coli phages that had been isolated independently, and
which were also coming under study, seemed similar to
λ in their overall temperate life-style, in their genetic map,
in their virion morphology, etc. These properties could
be contrasted with the strikingly different properties of
some other groups of phages, notably T4 and the six
other “type” phages of E. coli, whose study was advocated
most famously by Max Delbrück (42; these “type” phages
are reviewed in this volume: T1 in chapter 17; T2, T4, and
T6 in chapter 18; T3 and T7 in chapter 20; and T5 in
chapter 19). The λ-like, or lambdoid, phages included
prominently a collection of phages isolated in Paris and
studied initially by the Pasteur group. Genetic hybrids
between these and λ were used to great effect in early studies
aimed at understanding the nature of phage immunity (gene
regulation and regulator specificity), host range, and other
fundamental topics of phage biology (141, 154).

To the extent that the term “lambdoid” was ever 
defined formally, it included the idea that a lambdoid phage 
was capable of recombination with λ itself to produce a
functional hybrid phage, as was first shown with phage 
434 by Kaiser and Jacob (183). The layout of genes along 
the genetic map also turned out to be largely conserved 
within this group. Furthermore, it became evident fairly 
early that the lambdoid group was not confined to phages 
of E. coli, the most obvious example being the very well 
studied Salmonella enterica phage P22 (342) (see chapter 22 
for a review of phage P22 biology). 


The widespread perception of phage λ as an exemplar
of all temperate phages has sometimes led to usages of 
the term “lambdoid” that go far beyond the original 
intentions of those who coined the term, effectively 
rendering it meaningless. Thus, “lambdoid” has frequently 
been used as a synonym for “temperate,” and it has in 
other circumstances been used to indicate that a phage 
under discussion has the same virion morphology as λ,
particularly the long, noncontractile tail. Such usages 
remove any discriminatory descriptive power from the 
term. At the same time, advances in comparative genomics 
of the tailed phages in general, and specifically of λ and
its relatives (as described in more detail below), have led to 
a better understanding of the genetic structure of these 
phage populations. This has meant, paradoxically, that it 
has become increasingly difficult to find a biologically 
meaningful sharp division between the lambdoid phages 
and other tailed phages (219). 

Despite these difficulties, we have chosen for a more 
detailed discussion a group of four of the best studied 
lambdoid phages, whose complete genome sequences are 
known and which span much of the diversity in what might 
be considered a reasonable and current definition of the 
lambdoid phages. Our purpose is not to belabor the question 
of how to define a lambdoid phage, but rather to choose a
representative selection of phages in the genetic
neighborhood of λ. Comparing these phages gives a much
enriched understanding of the many functions that their
genomes share and that are often accomplished in subtly, or
sometimes in dramatically, different ways by the different
phages. In addition to λ, our “comparison group” phages
include E. coli phages HK97 and N15 (the latter is reviewed
in chapter 28) and Salmonella phage P22 (181, 265, 291, 314)
(chapter 29). We provide a detailed comparative map of
these four phages’ genomes (figure 27-SI), available on
the associated website at www.thebacteriophages.org.

In addition to these four, there are many other λ-like
phages that infect enteric bacteria that could just as
well have been included in such a comparison. Phages
HK022, HK620, Sf6, SfV, 933W, VT2-Sa, φP27, Fels-1,
ST64T, φKO2, ES18, Gifsy-1 and -2, for example, have
been completely sequenced (6, 52, 61, 181, 242, 247, 269,
296, 405; M. Pedulla, S. Casjens, and R. Hendrix,
unpublished data) and specialized aspects of phages 21,
82, 434, φ80, PA-2, PY54 and numerous others have been
studied. In addition, phages D3, φE125, and APSE-1, which
infect the nonenteric proteobacteria Pseudomonas aeruginosa,
Burkholderia thailandensis, and an endosymbiont of pea
aphids (215, 359, 389), respectively, as well as largely intact
prophages in several proteobacterial genome sequences,
have apparent transcriptional cascades and genome
organizations that are very similar to λ's, though with less
overt sequence similarity. We will refer to phages outside of
our comparison group when they provide informative examples.
Mosaic Relationships and Their
Evolutionary Origin 

Lambdoid Phage Genome Mosaicism 

Before we discuss the differences and similarities among 
these phages, we will first indicate how those relation¬ 
ships are thought to have arisen. This we will do by sketch¬ 
ing our best current understanding of the mechanisms 
by which the tailed phages evolve, and how those mechan¬ 
isms have led to the genomes of contemporary phages 
(see chapter 4 for additional discussion of phage evolution). 
This topic was first investigated in the late 1960s by 
comparing lambdoid phage genome sequences through 
electron microscopic visualization of DNA heterodup¬ 
lexes assembled from pairs of lambdoid phages (159, 327). 
The striking observation from these studies was that the 
ability of the two DNA strands from two different phages 
to form a heteroduplex—a measure of nucleotide sequence 
similarity—varies in a patchwork fashion across the 
lengths of the genomes. This argues that the genomes 
are genetic mosaics, generated by non-homologous (or 
possibly site-specific) recombination in the ancestry of the 
phages. These putative sites of recombination appeared 
to lie preferentially at certain locations, and it was hypo¬ 
thesized—and largely confirmed when the genomic 
sequences became available—that these sites are located at 
gene boundaries. 

Such observations led to the “modular evolution” model 
(11, 33, 342), which states that the horizontal exchange of 
genetic modules (genes or groups of genes) is mediated by 
homologous recombination between genomes at specific 
sites located between modules, and that this mechanism 
generates genomes with novel combinations of genes and 
thereby novel lambdoid phages. Such a mechanism, as well 
as recombination between homologous genes, no doubt 
contributes to rapid shuffling of alternative modules among 
mosaically related phages, and in some phages nearly 
ubiquitous sequences—now called “boundary sequences”— 
do appear to be present in some intergenic locations (71). 
The modular evolution model, however, does not explain 
how new mosaic junctions (novel sequence joints) arise 
nor how new non-homologous genes enter the phage 
gene pool. 

The analysis that is made possible by complete DNA 
sequences of multiple lambdoid genomes allows a much 
more detailed and subtle view of probable evolutionary 
mechanisms than was possible from the heteroduplex data. 
The most significant change in our views is that it appears 
that much and probably most of the non-homologous 
recombination that produces new mosaic joints is not 
confined initially to the positions of module boundaries. 
Rather, it is likely that these recombination events occur
quasi-randomly, both with respect to position along the
genome and with respect to alignment of the recombining
genomes (181). The expected result is a melange of
recombinant types, the great majority of which are not
packageable into virions or are otherwise defective as
phages and thus immediately lost to natural selection. The
recombinants that do survive will be strongly biased 
toward those that do not disrupt important functions. 
This means that in most cases they occurred at a module 
boundary. 
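
The argument that random recombination plus selection yields
joints at module boundaries can be caricatured in a few lines.
The sketch below draws recombination positions uniformly along
a genome and keeps only those that fall on a boundary. The
genome length, boundary positions, and function name are
hypothetical, and the model ignores the alignment of the two
partner genomes, which the text notes is also quasi-random.

```python
import random


def surviving_joints(genome_length, boundaries, trials, seed=0):
    """Draw random joint positions; keep those that disrupt no module."""
    rng = random.Random(seed)
    survivors = []
    for _ in range(trials):
        joint = rng.randrange(genome_length)  # quasi-random position
        if joint in boundaries:               # selection: intact modules only
            survivors.append(joint)
    return survivors


# Hypothetical boundary coordinates on a 1000-unit genome.
BOUNDARIES = {100, 250, 400, 650, 800}
TRIALS = 10_000
survivors = surviving_joints(1000, BOUNDARIES, TRIALS)
print(len(survivors), "of", TRIALS, "recombinants survive")
```

Every surviving joint sits at a boundary even though no event
was targeted there, and the survivors are a small minority of
all events, which is the pattern the paragraph predicts.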

In some cases, it appears that the recombination event 
giving rise to a surviving recombinant was slightly out of 
register, giving rise to a short quasi-duplication (143, 181). 
Such out-of-register events (both deletions and duplicat¬ 
ions) may be much more common than is apparent from 
observable sequences. For example, if a duplicated sequence 
does not provide a selective advantage it should be suscepti¬ 
ble to removal by subsequent deletion, giving rise to what 
would then appear to be the product of an in-register recom¬ 
bination event. 

In any event, non-homologous recombination between
phages generates variation in genotypes in the population
and therefore provides grist for the mill of natural
selection. Such recombination events probably happen
relatively infrequently, but given the astonishingly huge
population sizes of tailed phages (estimated at ~10^31
individuals globally (26, 378, 387); see also chapter 33) and
their probable ancient origins, the total number of such
events submitted to the scrutiny of natural selection to date
is most likely astronomical. Another source of variation, of
course, is point mutation. Mutations will gradually accu¬ 
mulate in diverging genes to the point where detectable 
sequence similarity is lost. Thus, even genes that encode 
proteins that have no amino acid sequence similarity can 
share a common ancestry (i.e., can be homologous). A third 
source of genome variation is homologous recombination, 
inasmuch as this will rapidly reassort through the phage 
population the novel sequences created by the first two 
means (above). The number of different extant lambdoid 
genomes is not known. However, the variety is thought 
to be extremely large, and in fact identical lambdoid 
phages have not been independently isolated from nature, 
even when multiple lambdoid phages were isolated from a 
relatively restricted location (159). 

Phage Morons 

The mechanisms discussed immediately above create 
new genomes by rearranging and modifying the sequences 
of existing genomes, but they do not introduce novel 
functions to the genome. A possible means of doing so has 
been identified in the form of small genetic elements— 
usually one or a small number of genes flanked by a pro¬ 
moter and a transcription terminator—that have been 
inserted in recent evolutionary time between two genes 
that are adjacent in a comparison phage (150, 181). These
elements, which have been termed “morons” (“units of more
DNA”), in some cases express proteins of known function,
and in these cases the genes are sometimes active from an
otherwise repressed prophage and provide a function that
appears to be beneficial to the host. The λ late operon
moron genes fit the definition of lysogenic conversion genes
(17, 358).

It need not be true that all morons contain lysogenic 
conversion genes, and even such genes that appear to bene¬ 
fit the host may sometimes only do so when the cell lyses 
(364, 365). Thus, we postulate that although moronic DNA 
may be added randomly to phage genomes rather frequently 
on an evolutionary time scale, most such additions are lost 
because they provide no selective benefit to or overtly 
damage the phage. Of the others, some are retained because 
they directly benefit lytic growth of the phage and others 
because they benefit the phage indirectly by acting to give 
selective advantage to the lysogenic host; again, the huge
population size of these phages makes such intrinsically
improbable events plausible. This view of morons as
generalized units of addition to genomes has been extrapolated
speculatively back in evolutionary time to suggest that, in 
principle, the entire phage genome could have been built by 
a stepwise addition of morons (150). 

At this point it is unclear what the biochemical mechanism
is that causes insertion of a moron into a genome.
Any novel DNA sequence, including a moron, can in principle
be inserted into a genome by “random” non-homologous
recombination (DNA joining) events. Alternatively, morons
could move by an unknown, more directed mechanism.

The Bigger Picture of Phage Relationships 

The complex set of mosaic relationships among the lambdoid
phages described in the preceding paragraphs is
embedded in an even larger population of phages that
may not initially appear to have any relationship to λ. This
can be illustrated by comparing bacteriophage Mu to the
four lambdoid phages in our comparison set. Mu has no
easily detectable sequence similarity to any of these four
lambdoid phages, at either the nucleotide or amino acid
sequence level, and the overall organization of genes, DNA
replication life-style, and transcription patterns are
significantly different between Mu and the lambdoid phages.
Furthermore Mu, unlike λ, has a contractile tail. However,
the recently sequenced Shigella phage SfV (6) and E. coli
phage φP27 (296) show that the λ-like and Mu-like groups
of phages, which have previously been thought to be
genetically distinct, are in fact part of a larger group of
phages that have been exchanging genes with each other
in the fairly recent evolutionary past.

The genes of phages SfV and φP27 are arranged like
those of a conventional lambdoid phage. The early genes of
SfV, for example, are related in a typically mosaic fashion
to those of λ, HK97, and P22, the “b2 region” (see below) has
similar genes to those in the b2 region of P22, and the head
genes are in the HK97 head-gene sequence family. The
surprise is that the SfV and φP27 tail genes make very
good matches to those of phage Mu, and they in fact have
contractile tails. Thus, this tail module appears to have been
exchanged between phages that are members of otherwise
quite distinct groups, and this in turn suggests the
possibility that all lambdoid and Mu-like phages are
partaking of the same pool of genes. The boundaries of the
lambdoid phages thus are increasingly difficult to define.
A curious taxonomic consequence of their different types
of tail module is that different members of the lambdoid
phage group would be or are formally classified into each
of the three(!) families of tailed phages, the Siphoviridae (λ),
Podoviridae (P22), and Myoviridae (SfV), since the current
taxonomic scheme emphasizes tail morphology (2, 219)
(see chapter 2 for a review of phage classification).

Gene exchange across even larger expanses of phage 
sequence space can be detected as well (151); for example,
the λ tail fiber assembly protein gene tfa (see below)
appears to have been transferred to (or from) phage T4 so
recently that the proteins from the two phages are still
functionally interchangeable (121, 137). The nature of the
sequence similarities that are detected among phages 
that infect phylogenetically very different hosts implies (we 
do not give the complete argument here) that gene exchange 
can and does occur across the entire population of tailed 
phages. An extreme hypothesis about the genetic structure 
of the global tailed phage population might be that it is 
a smoothly varying genetic continuum, and our apparent 
ability to subdivide it into biologically distinct groups is 
an artifact of sparse sampling of the population. We 
suspect that the truth lies between this extreme and the 
conventional view of distinct and different phage groups 
susceptible to simple classification. What the ultimate fate 
of the lambdoid phages will be, as a biologically meaning¬ 
ful taxonomic grouping of viruses, is not yet clear. 

Tour of the Lambdoid Phage Genomes 

This section is best read with figure 27-SI, found at 
www.thebacteriophages.org/chapters/0270.htm, available 
to the reader. 

Overall Genome Organization 

The lambdoid phages share a common genome organiza¬ 
tion, and it is within the context of this overall similarity of 
organization that the considerable differences among 
genomes are seen. The sizes of their genomes fall in a rather 
narrow range of approximately 40 to 60 kbp. This correlates 
with the observation that all lambdoid phages analyzed 
to date (λ, P22, HK97, and Gifsy-2) have isometric capsids
with a triangulation number (T) of 7, a size that neatly
packages genomes in this size range (49, 76, 88, 280, 384;
J. Conway, personal communication). It remains an
interesting question to what extent the evolution of these phages
is constrained by their capsid geometry. 
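
For readers unfamiliar with triangulation numbers, the
following sketch shows the standard Caspar-Klug relation,
which is background added here rather than something derived
in this chapter: T = h^2 + hk + k^2 for non-negative integers
h and k, with a nominal 60T subunit positions in an
icosahedral shell. The function names are illustrative, and
real capsids deviate from the nominal count (for example, a
portal occupies one vertex).

```python
def triangulation_number(h, k):
    """Caspar-Klug triangulation number for lattice indices h and k."""
    if h < 0 or k < 0 or (h == 0 and k == 0):
        raise ValueError("h and k must be non-negative and not both zero")
    return h * h + h * k + k * k


def nominal_subunits(t):
    """Nominal number of subunit positions in a T-number icosahedral shell."""
    return 60 * t


t = triangulation_number(2, 1)
print(t, nominal_subunits(t))
```

Because only certain T values are geometrically allowed, capsid
volume changes in discrete steps, which is one way to see why
capsid geometry might constrain genome size.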

The clustering and functional order of genes is strongly 
conserved as well (figures 27-2 and 27-SI). The linear form 
of the λ map shown in the figures represents the DNA
molecule in the virion, and it also corresponds to the 
experimentally determined genetic map. Starting from 
one end of the virion DNA (conventionally defined as 
the “left” end and corresponding to the end where DNA 
packaging commences) is the cluster of genes that 
specify the heads of the virions. These are arranged in 
a stereotyped order, transcribed toward the center of the 
genome (rightward on the standard map), and followed 
in turn by the tail genes and the tail fiber genes, also 
transcribed rightward. Together these structural genes 
take up approximately half of the genome, the left arm. 
The right arm of the genome is largely devoted to the early 
genes, including regulatory genes, genes encoding
replication and recombination functions, and an assortment of
“nonessential” or “accessory” genes that sometimes have
a quantitative effect on phage progeny yield in the
laboratory, are useful in particular situations, or have no
known function. Transcription of the early genes diverges from
a point near the middle of the right arm near the prophage
repressor (cI) gene. Near the right end is the late promoter
from which all the late genes are transcribed as a single 
operon. (Late transcription is perhaps most easily visualized 
by considering a circular version of the map, since the 
late operon extends across the ends of the virion DNA, 
which are joined by DNA ligase immediately following 
DNA injection. Such a circular map corresponds to the 
viral DNA following injection or prophage excision, and it is 
equivalent topologically to the multi-genome head-to-tail 
concatemers that predominate at the time of late
transcription.) Near the beginning of the late operon are the
lysis genes, followed, as transcription proceeds across the
joined ends of the virion DNA, by the head, tail, and tail
fiber genes. The sequences encoding the attachment site
(attP) for prophage integration and the associated
site-specific recombination functions are located in the
center of the virion
chromosome and, following integrative recombination, at 
the ends of the prophage. 

In comparing different lambdoid phage genomes, there 
are many examples of genes in corresponding positions 
that are homologous—that is, they carry out the same func¬ 
tion and appear to have common ancestry, as judged by 
sequence similarity. In any given comparison, some pairs 
of such genes are more similar in sequence than others, 
indicating different divergence times from their common 
ancestors (for example, the L genes of λ and Gifsy-2 are 66%
identical, while those of λ and HK97 are only 27% identical,
yet the λ integrase gene is 100% and 23% (only over the
N-terminal half) identical to its HK97 and Gifsy-2 homologs,
respectively). Other genes in corresponding positions can
be analogous but not homologous—they carry out the 
same function but they do not appear to have a common 
ancestry (i.e., they belong to different sequence families). 
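
Percent identity figures such as those quoted above depend on
how the alignment is scored. As a minimal sketch of one common
convention, the function below takes two sequences that have
already been aligned (equal length, with "-" for gaps) and
reports the fraction of gap-free columns that match. The
function name, the gap convention, and the toy sequences are
assumptions of this example and are not the method used to
obtain the numbers in the text.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identical residues over gap-free columns of an alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == gap or y == gap:
            continue  # skip columns where either sequence has a gap
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return 100.0 * matches / compared


# Toy aligned protein fragments, invented for illustration.
print(round(percent_identity("MKT-LLV", "MKSALLI"), 1))
```

Other conventions divide by the full alignment length or by
the length of the shorter sequence, so identities from
different sources are comparable only if the denominator is
known.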

There are also numerous genes that constitute insertions
or substitutions in one genome relative to a different
genome of the lambdoid group. One of the most dramatic
differences of this sort occurs between phages λ and N15.
The head and tail genes and the genome ends of these
two phages are very similar (averaging ~56% sequence
identity in encoded proteins), bespeaking an evolutionarily
rather recent genetic exchange between these two lineages,
but the right arms are dramatically different (291).
Consistent with this divergence, there are significant
differences in genome organization between these phages; each
has a number of functions that are apparently missing in
the other, and for the few N15 right arm genes that
are recognizable as probable homologs of λ genes, the
sequence similarity is weak. Thus, in phage N15, λ-like head
and tail genes are joined to a group of early genes whose
evolutionarily recent genetic partners are largely outside
the conventional lambdoid canon. (This is an example of
why it has become more rather than less difficult to
define what a lambdoid phage is.)

Other big differences within our comparison group
include the facts that phage P22 has a single gene encoding
its tail apparatus rather than the 13 that λ has, and
that P22 has an “extra” immunity region between its
head and tail genes (reviewed by 272, 342). These
differences can be regarded simply as a particularly dramatic
analogous but non-homologous substitution in the first case or
as an insertion into the context of a standard lambdoid
genome in the second.

Head Genes 

The order of individual genes that have the same function
within the head gene cluster is strongly conserved,
even in the face of mutational changes in the sequence
that erase any detectable sequence similarity (e.g., 50,
53, 57). This is true not only among the lambdoid phages;
many features of this conservation are also present
across the gamut of the tailed phages investigated to date.
Whether this order is so strongly conserved for functional 
or for historical reasons is not yet clear. 

Starting from the left chromosome end, the “standard”
gene order is: small followed by large terminase subunit
genes, portal protein gene, maturation protease gene,
scaffolding protein gene, major capsid protein gene, and genes
for head completion proteins. The studied lambdoid phages
conserve this order faithfully, but there are some interesting
variations in the actual gene structures. In addition
to these, there is often a small number of additional genes, 
scattered among those mentioned above, whose presence 
and sequences are less conserved.

In the generation of mosaic genomes by non-homologous
recombination, there is little evidence for survival of
the products of recombination events within the portion 
of the head gene region that is responsible for assembly of 
the procapsid (portal, protease, scaffold, and coat genes). 
The likely explanation for this is that the proteins encoded 
by these genes interact so intimately that they cannot 
be successfully reassorted with parallel but divergent sets 
of genes that accomplish their interactions in even subtly 
different ways. On this basis, we would expect that once 
the sequences of two sets of head genes have drifted apart 
beyond a critical distance, their encoded proteins are no 
longer able to interact correctly, and so they will
continue to drift apart.

In accord with this hypothesis, there are at least six
distinct families of lambdoid procapsid assembly genes
that are exemplified by phages λ, P22, HK97, Gifsy-2, 933W,
and ES18 (100, 181, 242, 269, 314, our unpublished results).
Head assembly has been studied only in the first three of
these six head types. These have no recognizable similarity
in their coat protein primary sequences, and there
are significant differences in the details of their assembly
mechanisms (reviewed by 47, 122, 148). In our focus
group of four lambdoid phages, three head gene types are
represented by HK97, by P22, and by λ and N15 together.
The HK97-like family of these genes is quite diverse, and
the similar heads of φP27 and SfV may constitute a
subgrouping within this family.

In spite of their lack of sequence similarity, there is
evidence that the genes involved in head assembly in these
different phages may in fact be homologous. All four of
our comparison phages have a small terminase-subunit
gene followed by a large terminase-subunit gene, but
the effects of the terminase proteins are different in parallel
with the sequence differences: both λ and N15 cut the
DNA during packaging to make 12-base 5' single-stranded
cohesive ends (290, 393), HK97 makes 10-nucleotide 3'
single-stranded ends (181), and P22 cuts initially at a range
of clustered sites (the “pac site” region) (60, 392) to produce
a blunt end. Phage P22 subsequently cuts sequence
non-specifically as the headful packaging process moves along
the DNA concatemer. Thus, within this group, the N15 and
λ small terminase subunits are similar in sequence, and
the P22 and HK97 subunits are unique.

Also consistent with homology in head assembly
genes, the large terminase subunits are, with the portal
proteins (below), the most strongly conserved (in amino
acid sequence) structural proteins across a wide range
of phages (50, our unpublished observations). We hasten
to point out that even at this “high” level of conservation,
the P22 terminase and portal amino acid sequences, for
example, cannot be directly aligned with their functional
counterparts in the other members of our comparison
group, but they are members of a transitive set in which
each member can be aligned with some but not all other
members. This conservation likely reflects their central
roles in the DNA recognition/cleavage process and in the
DNA packaging pump (94, 328). We imagine that the complex
interactions among the parts of this machine and
with the phage DNA impose severe constraints on what
sequence changes can be tolerated. There is a considerable
body of information about how the λ terminase subunits
interact with each other and with sites on the DNA
(65, 83, 136). The similarities and differences with the
sequences from phage N15 reinforce and augment this
information (291).
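
The idea of a transitive set mentioned earlier in this paragraph, in which each member aligns with some but not all others, can be illustrated as a connected-components computation over pairwise alignability. The sketch below is an assumption-laden illustration only: the function name is hypothetical, the input is taken to be a list of pairs already judged alignable by some external criterion, and the labels are placeholders rather than real proteins.

```python
from collections import defaultdict


def transitive_sets(alignable_pairs):
    """Group items into sets linked by chains of pairwise alignments.

    Two items end up in the same set if a chain of directly alignable
    pairs connects them, even when the two cannot be aligned directly.
    """
    graph = defaultdict(set)
    for a, b in alignable_pairs:
        graph[a].add(b)
        graph[b].add(a)

    seen = set()
    groups = []
    for start in graph:
        if start in seen:
            continue
        # Depth-first walk collecting everything reachable from `start`.
        stack = [start]
        group = set()
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            group.add(node)
            stack.extend(graph[node] - seen)
        groups.append(group)
    return groups


# Placeholder labels: A aligns with B, and B with C, so A and C share a set
# even though no direct A-C alignment is listed.
groups = transitive_sets([("A", "B"), ("B", "C"), ("D", "E")])
```

Membership in the same set is weaker evidence than a direct alignment, which is why the text is careful to distinguish the two.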

Portal proteins form the hole through which DNA enters 
the procapsid (18, 210). They are part of the sensor that 
determines when the head is full of DNA and they may 
be active participants in the DNA translocase machine that 
drives DNA into the procapsid (62, 328). Also possibly relating
to portal function, there are reports for several phages
of a gene, of which Bacillus phage SPP1 gene 7 is the best 
example, located immediately downstream from the portal 
gene and encoding a protein that both interacts with the 
portal during virion assembly and is essential for efficient 
production of infectious virions (89). There is an apparent 
homolog of this gene in the lambdoid phage ES18 (our 
unpublished results) and in λ’s more distant relative, phage
Mu (249). We can speculate that head assembly in most 
of the lambdoids has lost its need for this function, or 
that they never acquired this function. Alternatively 
and perhaps more likely, the function may be present in 
all the lambdoid phages but is most often incorporated 
into another protein such as the portal protein. 

In some, but not all, tailed phages the coat, scaffolding and
portal proteins (in addition to the proteases themselves) are
proteolytically cleaved by phage-encoded proteases in an
assembly-dependent manner. These head maturation
proteases in the lambdoid phages fall into at least two
completely distinct sequence families which apparently
act quite differently. The λ and N15 proteases are members
of the ClpP family. About 10 copies of this putative λ
protease, gpC, assemble into the procapsid where they
participate in an unusual reaction in which each copy of
gpC becomes covalently fused to a copy of gpE, the major
capsid subunit, and each joined molecule loses about
two thirds of its mass to proteolysis (147, 148). Based on
indirect evidence, it is thought that gpC is responsible for
this proteolysis, for the removal of 21 amino acids from the
N-terminus of the portal protein, gpB, and for the degradation
of the 100 or more copies of the scaffolding
protein, gpNu3, found in the interior of the procapsid
(144, 168, 251).

Phage N15 assembly has not been studied, but the
similarity of the genes involved indicates that it must be
very similar to that of λ in this regard. In contrast, the phage
HK97 protease is not recognizably part of any established
protease family, though it is a member of a large family of
homologous proteins found in other tailed phages. It is 
assembled into the procapsid in approximately 50 copies 
and is responsible for degradation of both HK97’s putative 
equivalent of a scaffolding protein and of itself (90, 148). 
The only proteolytic processing known to occur in phage 
P22 head maturation is the essential cleavage of injection 
protein gp7 by the host OpdA protease (75), and there is 
correspondingly no protease gene at this position of its 
genome. 

Scaffolding proteins occupy the interior of the procapsid,
are required for proper procapsid assembly, and are
removed before or during the DNA entry process. The
phage P22 scaffolding protein is the prototype for this
function, and it is encoded by a separate gene immediately
upstream from the major capsid subunit gene (206).
λ uses a different scheme for expressing its scaffolding
protein in which the scaffolding protein is encoded in the
last approximately one third of the protease gene by
means of a strong internal in-frame translation start (323).
This same arrangement for the scaffolding protein has
also been found in the sequence-dissimilar phage Mu (249).
The P22 and λ scaffolding proteins have no recognizable
sequence homology, and in P22 they exit the procapsid
intact (to be recycled in new procapsid assemblies;
59, 206) while in λ they are proteolytically destroyed. In
phage HK97 the scaffolding function is thought to be 
fulfilled by a 102 amino acid sequence at the N-terminus of 
each capsid protein, and it is this sequence that is cut into 
pieces by the protease (93, 148). Perhaps there was a gene 
fusion in the ancestry of HK97 that joined the scaffolding 
protein and capsid subunit genes; alternatively the P22 
genes may have been derived from an HK97-like ancestor 
by gene fission. 

Sequences of coat protein (also called major capsid
protein or major head protein) are very diverse, but
the HK97 coat protein can be aligned with more than half
the tailed-phage capsid sequences in the databases, including
those of a large number of phages that infect Gram-positive
hosts. Interestingly, the alignment suggests that
about half these phages have the scaffolding function
incorporated as an N-terminal segment of the capsid protein,
as in HK97, and the rest appear to have a separate
gene for the scaffolding protein just upstream from the
capsid-protein gene, as in P22. Further experimental studies
are required to confirm this speculation. The P22 and
λ/N15 capsid proteins belong to sequence families that
are not detectably related to HK97 or to each other. However,
the x-ray structure of the HK97 coat protein shell and a
high-resolution cryo-electron microscopy structure of the
phage P22 coat protein shell argue that P22 coat protein
has the same unusual fold as the phage HK97 coat
protein (178, 382). In addition, there are numerous similarities
in the head structures and assembly mechanisms
for all three groups which, taken together, argue that all
three groups of capsid proteins may share common
ancestry and may have retained the same protein fold and
biochemical/functional properties in the face of the great
divergence of their amino acid sequences.

The last two genes in the λ head region, located between
the coat protein subunit gene and the start of the tail
gene region, are FI, which has an accessory role in DNA
packaging (64, 254), and FII, which encodes the last protein
to join the assembling head and determines the specificity
of tail attachment (46, 239); again, phage N15 has
genes homologous to those of phage λ at this position.
HK97 has two apparently unrelated genes in the corresponding
positions that we imagine may serve the same
functions, but this has not been tested.

This region of the P22 genome is unrelated to that of
λ or HK97 and is more complex, with three genes that
have essential roles in “head completion” (genes 4, 10, 26),
a gene encoding an uncharacterized assembly factor (gene
14), and three genes encoding proteins with essential
roles in DNA injection and which are probably themselves
injected into the cell along with the DNA (genes 7, 16,
20) (34, 172, 207, 342). The three P22 head completion
genes may subsume the functions of λ FI and FII. For
the three DNA injection genes, we speculate that
these may carry out some of the functions surrounding
infection that are accomplished by tail genes in a phage
like λ, despite the fact that these three genes of phage
P22 are generally thought of as head genes. Recent
evidence shows that one of the P22 head completion
proteins, gp4, has cell wall hydrolase activity (248). This
may be analogous to the lysozyme found in the tail of
phage T4 (188) and some other phages, though not yet
located in the phage λ virion nor the virions of any of its
close relatives.

Finally, the λ and N15 phages have two essential head
genes which have no clearly corresponding genes in phages
HK97 or P22; these are the head completion genes W
and D. λ W is a small gene lying between the genes
encoding the large subunit of terminase and the portal. The
encoded protein binds to the head, presumably to the
portal, following DNA packaging and provides a binding
site for the FII protein, which in turn provides the binding
site for the tail (46, 58). A high-resolution structure of gpW
has been determined (240). Some other lambdoid phages,
for example SfV and φP27, have a gene in the W position
that encodes a small protein with a similarly high pI, but
which has no sequence similarity to gpW (296). As suggested
for the gene mentioned above that lies downstream from
the portal gene in SPP1 and Mu but not λ, W may provide
a function not enjoyed by many other phages, or alternatively
the gpW function may be present but subsumed
into another protein in phages such as HK97 and P22 that
lack an obvious W homolog.

The λ D gene lies just upstream of E, and the D protein
binds to the surface of the procapsid to form trimers
on the surface, with the consequence that there is one
gpD next to each gpE subunit. The gene D protein can be
used as a “protein display system” and its atomic structure
is known (246, 339, 400). Proteins of this type that stabilize
the head shell are often called “decoration proteins”
since they decorate the surface of the head. Some other
viruses use this same strategy to stabilize their capsids,
notably phage T4 and Herpesvirus (29, 158, 310), while
others, such as P22, do not and instead rely solely on
the considerable stabilization that comes from the conformational
rearrangement (and expansion) of the capsid
shell that all the tailed phages share. Phage HK97 and
a minority of other tailed phages have no decoration
proteins, but finish capsid maturation and stabilize the
shell by forming covalent bonds among all the subunits
of the shell, binding them together with a combination of
covalent and topological links into a unified structure
known as chainmail (91, 148, 271). No other factors are
required for this crosslinking: the coat protein itself
catalyzes the reaction (92). Such crosslinking was discovered
in HK97, but is now known to occur in a number of
other phages (109, 110, 124, 138, 244).

We note that even when two phages such as λ and
N15 have a perfectly homologous set of head (or tail) genes,
it does not mean that the encoded proteins can substitute
for one another or that they function in precisely the same
fashion. The relationship between the 10 homologous λ
and phage 21 head genes has been particularly well explored
by Feiss and colleagues (179, 326, 333, 398), who found
that the 21 terminase function absolutely requires participation
of the host IHF protein whereas λ terminase
does not, and that among the 10 head gene products
only one 21 protein, gpFII, can fully substitute for the parallel
λ gene product in head assembly. This large fraction
of failures to complement almost certainly reflects the intimate
protein-protein interactions that take place during
virion assembly.

Tail Genes 

Among our four comparison lambdoid phages, the
tails of P22 are very different from the others and the λ,
HK97, and N15 tails are quite similar to each other in
morphology, genetic organization, and, over much of the
tail gene region, in sequence. The latter three have long,
noncontractile tails. The tails of phage P22, by contrast,
consist of six trimers of a single protein, the tailspike
protein, gp9, and morphologically the tail is very short and
more like a floret of protein than a tail (126). The tailspike
protein has been studied quite extensively as a model for
protein folding (23, 27), and a crystal structure is known,
showing it to be primarily a large β-helix (336, 338 and
references therein). The tailspike is also an enzyme, with
endorhamnosidase activity, which allows it to degrade the
external O-antigen polysaccharide of its Salmonella host,
presumably as a means of gaining access to the cell surface
for infection (175, 337) (though see chapter 29 for a
counterargument for the utility of endorhamnosidase activity).
There are at least two other lambdoid tail paradigms that
do not happen to appear in our four comparison phages:
phage 933W has an intermediate-length tail about half
the head diameter, and its tail genes are not recognizably
similar to any known genes (269); and phages φP27 and
SfV have contractile tails that are similar to phage Mu
tails (discussed above) (6, 296).

Phage λ has 11 genes devoted to making the tail proper
(that is, not including the side tail fibers), and HK97 and
N15 have the same number. For phage N15 these are all
homologs of the λ genes based on shared sequence similarity.
For phage HK97, the leftmost five tail genes are not
obviously related to their λ counterparts by sequence, but
for three of these five (genes 12-14) functions have been
established and match those of the λ genes in the corresponding
positions (genes V, G, and G-T). The six rightmost
HK97 tail genes have similar sequences to λ and N15 tail
genes (λ genes H, M, L, K, I, and J).

The organization of the tail genes of these three phages
is evidently the same, and it is likely, as with the head
genes, that the functional order of tail genes is conserved in
phages with long, noncontractile, λ-style tails even in the
absence of detectable sequence similarity. Because clearly
defined and unique functions have not yet been ascertained
for a number of the tail genes, it is not yet possible
to test this assertion as clearly with the tail genes as with
the head genes. However, genes corresponding functionally
to λ genes V, G, G-T, H, and J can be identified in a
wide range of phages with this tail morphology, and their
relative order is universally conserved (399). Mechanisms
of tail assembly for phages with this type of tail have
been studied in detail only for λ itself (192, 195, 196), but
the similarity of most of the tail genes of phages such as
HK97 and N15 to those of λ makes it highly likely that
they assemble tails similarly.

The roles of the virion proteins in the injection
process are among the more poorly understood aspects of
lambdoid phage biology. Since their terminal tail structures
are considerably smaller than the baseplates of the larger
phages, such as T4, it has been much more difficult to
obtain evidence of structural changes that likely occur
during adsorption and injection. Nonetheless, it seems very
likely that substantial “baseplate” rearrangements analogous
to those that occur during T4 infection (80) do occur
in all tailed phages, since the DNA has to be released
from the virion only upon receptor binding. It is known
that gpH, in λ, is released from the tail core (307, 308)
and has some role in successful entry of DNA into the
cytoplasm (98, 317). In phage P22 three proteins (gp7, gp16
and gp20, which assemble into the head rather than
the tail) similarly participate in DNA entry into the cell
(166, 167, 172).

Tail Fibers 

The J gene of phage λ encodes the tail fiber that binds to
the well-studied λ receptor of E. coli, the LamB protein of
the outer membrane. The J protein constitutes much of
the mass of the conical tail tip, probably as a trimer, and
a portion of it extends from the end of the tail tip to form
the short fiber that is visible in electron micrographs
(195, 200). This fiber is most likely the part of the tail that
interacts with the LamB receptor, and if so it is probably
made of the C-terminal portion of gpJ since host-range
mutants that affect the interaction with LamB map
to the C-terminal region of gpJ (39, 74, 250, 369, 377).

Phages HK97 and N15 have homologs of λ’s gpJ that
are presumed to carry out this same function. It is curious
that even though λ and HK97 both use the LamB receptor
(85), the sequence similarity between their J proteins
does not extend to their C-terminal parts, the portion of
the protein thought to interact with the receptor. P22 is
not known to have a fiber comparable to the J protein fiber,
but the “initial contact” receptor-binding function of those
fibers is incorporated into the trimeric tailspike protein,
gp9 (173, 174). (The receptor is the cell-surface O-antigen
lipopolysaccharide.) It is interesting to note that the
non-homologous receptor-binding proteins of λ and P22 are
both trimeric and both utilize the N-terminal parts of
the proteins to bind the virion (337).

Many phages have additional tail fibers of a different 
sort, of which the long, bent tail fibers of phage T4 are 
perhaps the best known. Its fibers are made up of
homotrimers of different proteins that constitute the body of
each half fiber: two proteins form the “elbow” and two are 
required for assembly of the fiber (reviewed in 388). One 
of the assembly-promoting proteins, gp38, may (in phage 
T2) or may not (in phage T4) be retained as a component of 
the assembled fiber (302). The fiber proteins form complex
sequence families that are characterized by extensive
intragenic mosaicism (135, 312). 

As an offshoot of studies of the relationships among
these tail fibers, it was discovered that phage λ also has
tail fibers of this sort, although in λ the fiber is built from
only one polypeptide, the Stf (side tail fiber) protein (149).
These fibers had not been known previously because
the variant of λ that is used in most laboratories around
the world, λPaPa, has a frameshift mutation in the structural
gene for the fibers. This mutant lacks the long
“side tail fibers” that can be found on the version of λ that
comes directly from the original E. coli K-12 lysogen; as
a consequence, it makes larger plaques under laboratory
conditions (the reason it was picked in the early days of
λ genetics) but adsorbs to cells somewhat less efficiently
(see chapter 5 for the explanation of why larger plaques
are an expected consequence of less efficient phage
adsorption).

The λ Stf protein has segments of close sequence similarity
to the corresponding proteins of phages T4, P2, and
T5. The λ Tfa (tail fiber assembly) protein is a close
relative of T4 gp38 (above) and the two are functionally
interchangeable (121, 137). Phage HK97 has two genes (28
and 29) in the position corresponding to λ’s stf and tfa,
and they show weak similarity at the amino acid level
to the λ proteins. The morphology of HK97’s tail fibers is
not well defined. N15 also has three functionally unassigned
genes (genes 22, 23 and 25) in the tail fiber region
of the genome, and their only sequence similarity is that
of phage N15 gp25 to phage HK97 gp28; since both N15
and HK97 have short brushy tail-tip fibers, perhaps
these proteins form these fibers.

Morons and Terminators 

Scattered throughout the genomes of these phages are a few
genes, usually with associated transcription
signals, that we classify as morons (150, 181). These are
most obvious in the head and tail regions, and so we
discuss them here. Morons in their typical form include
a small coding region flanked by a promoter and a stem-loop
(factor-independent) terminator. Morons are most
obvious when they interrupt a stereotyped series of 
genes such as the lambdoid tail genes, where they may 
be inserted between two tail genes that are adjacent in a 
comparison phage (for example HK97 genes 15, 20, 22, 
and 23; 181). Morons are thought to be recent additions 
to the genome rather than ancestral genes that have been 
lost from the comparison phage, because (i) they do not fit 
into the functional clustering observed for most other 
lambdoid phage genes, (ii) similar morons often lie in 
different locations in different phages, and (iii) in a 
number of cases the G-C content of the moron DNA differs 
from that of the flanking regions, suggestive of recent 
horizontal transfer. 
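
Criterion (iii) above rests on a simple comparison of base composition. As a rough sketch of that comparison, the functions below compute the G+C fraction of a candidate insert and of its flanking DNA and report the difference. The function names and the toy sequences are hypothetical, and no threshold for what counts as a meaningful difference is implied; in practice that judgment would depend on the lengths of the regions and the variation along the genome.

```python
def gc_content(seq: str) -> float:
    """Fraction of G and C among the unambiguous bases (A, C, G, T) in seq."""
    bases = [b for b in seq.upper() if b in "ACGT"]
    if not bases:
        return 0.0
    gc = sum(1 for b in bases if b in "GC")
    return gc / len(bases)


def gc_difference(insert: str, left_flank: str, right_flank: str) -> float:
    """G+C fraction of the insert minus that of the combined flanking DNA."""
    return gc_content(insert) - gc_content(left_flank + right_flank)


# Hypothetical toy sequences: an AT-only insert between GC-only flanks.
diff = gc_difference("ATAT", "GGCC", "GCGC")
```

A strongly negative or positive value for a real insert would be consistent with, though not proof of, recent acquisition from a source with different base composition.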

Where their functions are known, the morons in the 
late operons of the lambdoid phages all have roles that 
are not relevant to virion assembly. In regions of less highly 
conserved genes, such as in the early operons, it is more 
difficult to recognize recent additions, so genes cannot 
always be classified unequivocally as morons or as more 
fundamental parts of the phage genome; however, the sieB
(superinfection exclusion) lysogenic conversion gene,
which lies backwards in the early left operon of λ and P22,
is an excellent example of a possible moron in an early
operon (283, 344). 

An economical if speculative explanation for the patterns
of morons seen across the phage population is that
many and perhaps most of the genes in the present-day
genome initially entered the genome as morons, and,
when they found themselves retained because of the
selective benefits they conferred, gradually mutated to
become integrated into the regulatory circuitry of the phage
and to have the sequence characteristics of the surrounding
genes (150). An argument has been made for such an
origin for the λ D gene (150, 249). A similar scenario might
be invoked for the immI region of P22 (315, 318),
which is located just upstream from the tailspike gene
and is absent or replaced by other (apparently non-virion
assembly) genes in the otherwise quite similar genomes
of the P22-like phages L, APSE-1, HK620, ST64T, and Sf6
(61, 71, 139, 247, 359).

There are also numerous putative stem-loop transcription
terminators scattered throughout the genomes of
these phages. Some of these are associated with morons,
and some, especially in the early control regions, can be
understood as parts of the normal regulatory circuitry.
Others fit neither of these cases in an obvious way. For
example, apparent terminators immediately downstream
of the coat protein gene are found in all four of our comparison
phages. These terminators are typically located
between genes and oriented in the direction of local
transcription, arguing against their being in the genome by
chance. It has been proposed that a function of the terminators
may be to prevent potentially harmful transcription
from the repressed prophage originating from morons
or from elsewhere within or outside the prophage (181).

For phages like λ it is presumably not a problem to
have terminators scattered throughout the lytic operons
because of the transcription antitermination functions
of the N and Q proteins that are expressed during
lytic growth (116, 129, 306, 376). In accord with this, terminators
are not generally found within lytic operons of
phages that are not known to practice antitermination.
On the other hand, a group of temperate mycobacteriophages
that appear not to have such terminators within
their operons have apparently found an alternative way
to prevent transcription specifically from the repressed
prophage: they have repressor binding sites (“stoperators”)
scattered across the genome that terminate transcription
only when the repressor is bound (37). Such similarity
of functional outcome achieved by different molecular
mechanisms argues for the evolutionary importance of
the function.

The b2 Region 

A deletion mutant of λ isolated in 1961 (201), λ b2, has
lost nearly 6 kbp of DNA from a region of the genome
(hence known as the b or b2 region) that starts at the
attachment site (attP) and extends leftward toward the
tail fiber genes.
Except for some irregularities in prophage integration due
to the alteration to the attachment site and loss of the
sib site, λ b2 is unimpaired for growth in the laboratory.
This region encodes three well-expressed proteins made
from the early transcript starting at the PL promoter (140),
one of which has been shown to have endonuclease activity
(22), but it has not yet been possible to establish any
beneficial function these proteins perform for the phage.
Examination of other lambdoid phages shows that they
frequently have similar lengths of lytically nonessential
DNA in this region (that is, between the tail fiber genes
and the attachment site), but it most often has an unrelated
sequence. The impression given is of a region that is swapping
segments of DNA in and out on an evolutionarily
rapid time scale.

For phages such as λ where the functional role of the
central (b2) region is not apparent, one view is that the
genes in this region do confer a selective benefit on
that phage, but that the benefit may apply in an ecological
situation that has not been replicated in laboratory
studies. An alternative view is that the encoded proteins
may not necessarily confer any benefit but that there is
a benefit to having DNA of any sequence here to act as
a “stuffer,” because (for example) this improves DNA packaging
efficiency (340). The latter idea may apply for
phages such as λ in which the phage packages the DNA
between two specific sequences on the concatemeric DNA;
it is less obvious that it has any relevance to phages
such as P22 that package by a headful mechanism (54,
177, 356).

Turning to our other three comparison phages, P22 
has three genes in this region. In this case, although these 
genes have no apparent role in lytic growth, they do have 
clearly defined functions in host serotype conversion 
(O-antigen modification) and presumably benefit the phage 
indirectly by benefiting its lysogenic host (360). A number 
of Shigella flexneri phages carry similar lysogenic conversion
genes that encode various surface polysaccharide-modifying
enzymes. Where these S. flexneri phages have been
studied, they have included lambdoid phages in which the 
relevant genes lie in this position (6, 7, 72). 

HK97 has a rather short b2 region, encoding no proteins,
but related to the λ sequence in such a way that it
could have been derived from a λ-like b2 region by a deletion
of 3278 bp. Perhaps we have caught HK97 at a point in its
evolutionary history when it is “waiting” for an infusion of 
new DNA. Phage HK022, which has an identical coat 
protein to HK97 and nearly identical DNA packaging 
proteins, has a genome that is about 1 kbp bigger than 
HK97's, so it seems likely that HK97 could accommodate 
additional DNA. HK97 also has a 51 bp segment in its 
abbreviated b2 region that is derived from a non-coding 
portion of the type 1 Shiga toxin operon, suggesting that 
an ancestor of HK97 may have carried this toxin operon 
here (see below). 

For phage N15, simple comparisons to the other lamb¬ 
doid phages break down from this point rightward, as this 
is where phage N15’s gene organization and sequences 
strongly diverge from the other phages in the group. 
The genes found at this position are homologs of sopA 
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and sopB plasmid partition genes, which have a role in 
maintaining the linear plasmid that is the N15 prophage, 
and a homolog of bacterial umuD genes (291). A cleaved 
form of UmuD, UmuD', is a subunit of the bacterial PolV 
error-prone DNA polymerase; the phage gene differs from 
the bacterial one first in that it does not encode the 25 
amino acid N-terminal inhibitory peptide that gets cleaved 
off the bacterial UmuD protein during the SOS response, 
and second in that the other subunit of PolV, UmuC, is
not encoded in the phage genome. Phage P1 (reviewed in
chapter 24), another very different phage with a circular 
plasmid prophage, also has sopA, sopB, and umuD' genes, 
and the latter’s product has been shown to functionally 
replace the host umuD gene (243). 

Integration Functions 

The elegant and complex mechanisms by which phage λ
controls the expression of its integration functions are
described above. The HK97 sequence in this region is nearly
identical to λ’s, and accordingly HK97 integrates into the
same attB site as λ. Similarities between the two phages
include attP, PL, PI, and sib, so although it has not been
tested experimentally, it seems certain that HK97 regulates
Int (integrase) and Xis (excisionase) expression essentially
the same way λ does. In contrast, P22, like many phages,
integrates its prophage not between bacterial genes but
into a tRNA gene (tRNA-Thr), thereby disrupting the tRNA
gene but reconstituting it with the first 46 bp of the prophage,
which are identical to the 3' part of the tRNA gene
(228) (other phages may similarly reconstitute protein-encoding
genes when they integrate into them, for example
phage 21’s integration into the icd gene of E. coli; 45). The
P22 Int sequence is only weakly related to that of λ, and
their Xis sequences are not detectably similar (though, intriguingly,
a 51 bp segment of the P22 xis sequence, presumably
not functional, is found a short distance upstream
from the xis gene of HK97; 181).

Given the apparently fine-tuned sophistication of the
regulatory circuitry by which λ controls Int and Xis expression
(above), it is perhaps surprising that P22 appears
not to use this mechanism at all. P22 lacks a sib site and
a PI promoter, and unlike λ it has a rightward-oriented
promoter, PaI, a short distance upstream from xis (396).
The details of regulation are not as well understood as
in the λ case, but it is postulated that PaI produces an antisense
RNA that regulates expression of Int or Xis from
the PL transcript. In addition, the integrase gene of φ80
is inverted relative to λ (223) and the integrase genes
of HK620 and Sf6 are on the other side of the attP site
and so are likely to be regulated by other, as yet unknown
mechanisms (71, 72).

The prophage form of phage N15 is a low-copy-number
linear plasmid with covalently closed hairpin ends and thus
not integrated into the host chromosome like the other
prophages discussed here (291, 292) (for a review of phage
N15’s unusual biology, see chapter 28). Accordingly, N15
has neither an attachment site nor an integrase, but it
does have features that can be considered analogous to
them. When the linear N15 DNA enters the cell during
infection, it circularizes, like λ, through its cohesive ends.
If the phage is entering the lysogenic cycle, the phage-encoded
protelomerase enzyme (Ptl) binds to the DNA at
the telRL site, located adjacent to the ptl gene, and makes
staggered cuts in the DNA. To this point, this description
is similar to what happens with λ integrase. However, in
the next step Ptl joins the cut ends of the phage DNA not
to bacterial DNA but to the ends of its own complementary
strands, to form a linear prophage molecule with hairpin
ends and a gene order that is a circular permutation of
the gene order of virion DNA (84, 170, 287). The regulation
of Ptl function is not understood, but the ptl gene and
telRL site occupy locations that are perfectly parallel to λ’s
int and attP, respectively.

Leftward Early Operon “Ea” Region 

In most lambdoid phages examined to date there is a 
region of a few kilobase pairs in the leftward early operon, 
upstream from the int and xis genes and downstream from 
the genes promoting homologous recombination, where it 
has been difficult to determine functions for the genes. 
Deletions have little or no negative effect on phage growth, 
and in some cases there are uncharacteristically large 
stretches of apparently unused DNA between genes (we call 
this the “Ea” region after the P22 and λ genes in this
region). Failure to find functions for these genes may simply
mean that appropriate tests have not been devised; and
indeed recent experiments implicate four small genes in
this area of the λ genome in modulation of the E. coli
cell cycle (322). However, it may be the case for other genes 
in this region that they truly do not have a function and 
that their presence is being tolerated by the phage, at least 
in the short term, in the absence of any benefit to it. The 
reason for suggesting this is that in comparisons between 
phages, a number of the genes in this region appear to 
be made of fragments of other phages’ genes from this 
region (see below). The appearance is that, like the central 
(b2) region on the other side of attP discussed above, this
part of the genome is recombinationally more active than 
most other regions of the genome. 

It is possible that non-homologous recombination 
events may be more frequent in the regions near attP 
because incorrect prophage excision events can give rise 
to functional phages that have either lost genes near attP 
or picked up bacterial genes near attB (the latter process 
is called “specialized transduction” and has been studied
in some detail; 373). These remarks about the Ea region,
of course, apply to the integrating members of the lambdoid
group. On the other hand, recombination events may
in fact not be more frequent in these “active” areas but rather
recombination events here may be less likely than in other 
regions to put the resulting recombinant phage at a selective 
disadvantage and therefore cause the recombinant to be lost 
from the population. 

In the case of HK97 this region is reduced to a single
gene (gene 37), but comparison with the largely similar
HK022 suggests that HK97 gene 37 is a hybrid that could
have been created from a fusion of the upstream and
downstream halves of HK022 genes 37 and 32, respectively,
by the process of deleting parts of those two genes
together with the four genes in between (143, 181). Two
other examples of this type of relationship are the different
lengths of homologs of λ ea22 found in different phages,
and the phage Sf6 gene 21, which is a fusion of two phage
933W genes, L0065 and L0069 (61).

Homologous Recombination Functions 

The phage-encoded genes responsible for homologous
recombination have been studied extensively by genetic
and biochemical means in both λ and P22 (256 and references
therein). Although these groups of genes carry out
the same overall function and are located at the same
place in the genomes, they are so different in sequence,
protein structure, and protein function that it seems certain
they are not homologous. In λ, the early left operon genes
exo (exonuclease), bet (strand annealing protein), and gam
(anti-RecBCD protein) have been shown to cause homologous
recombination (274, 332, 335), and their products
have been studied in some detail (above).

The recombination genes of phage P22, which are not
obvious homologs of the λ recombination genes (176),
have also been studied in some detail, although their biochemical
roles have not all been assigned. Here there are
four recombination genes: arf, erf, abc1, and abc2 (119, 176,
255, 273, 276). HK97, which in some nearby areas of the
genome such as the regions around the int and cI genes
is nearly identical to λ, has putative recombination genes
that are homologs of P22 recombination genes erf and
abc2. A striking feature of the comparison between these
HK97 and P22 genes is that the erf genes are highly
similar for the N-terminal approximately 150 codons of
the 201 codon gene, at which point the similarity becomes
suddenly and dramatically less, arguing that there has
been a non-homologous recombination event at this point
within the coding region of the erf gene (181).

Also consistent with a non-homologous recombination
event, the phage Sf6 erf homolog is similar to the P22 gene
only in its C-terminal portion (Sf6 also carries a bacterial-type
single-strand binding protein gene in place of a P22-like
abc1 gene) (61). Such events within coding regions are
seen considerably less frequently than those occurring
between genes; the explanation in this case appears to be
that Erf is a two-domain protein, and these recombination
events occurred at the domain boundary, presumably
making a protein that is still fully functional (181, 278). In
the case of phage N15, there is no evidence for homologous
recombination genes, so if they exist they must belong to
novel sequence families.

There is at least one additional, different type of homologous
recombination module that is found in some
lambdoid phages but not in our comparison group. Phage
Gifsy-2 has an exonuclease VIII gene (recE) homolog that
lies at this position (242), and it has been shown that the
recE exonuclease and adjacent recT strand-annealing protein
genes (also at this position) from the E. coli K-12 defective
lambdoid prophage Rac encode proteins that can replace
host homologous recombination functions (113, 216, 385).
Every lambdoid phage that has been studied in this
regard carries some combination of exonuclease, strand-annealing
protein, and RecBC modifier genes. Since
Exo and Bet proteins have been shown to interact (282),
it seems possible that these functions cannot be
mixed and matched indiscriminately to give complete
function.

Promoter Proximal Early Left Operon Accessory Genes

In addition to the early left genes discussed above, lambdoid
phages often carry additional nonessential genes between
the homologous recombination gene cluster and the
early left promoter. In λ these are kil, ea10, and ral, which
encode proteins that block host cell division, bind single-stranded
DNA, and modulate type-I host restriction-modification
systems, respectively (77, 233, 298).

SieB, although it lies inside the early left operons 
between genes N and ral, is transcribed from the opposite 
strand (283, 284). It contains a nested in-frame gene, 
esc, whose product inhibits SieB action: Esc thus allows 
phage P22 to escape exclusion by the SieB protein. 
The sieB/esc genes fit some of the definitions of a moron 
(above), but the fact that the esc gene is regulated by 
an antisense RNA repressor (284) that is processed 
from the early left operon message suggests that if it is a 
late arrival, it has become sufficiently integrated into the 
λ circuitry to take advantage of the PL transcript in its
regulation. 

P22 has ral, sieB, and kil homologs in this position 
(its kil gene has a similar function but is not obviously homologous
to that of λ) along with three other genes whose
function is not clear. HK97 has two genes of unknown
function at this location. The sequences of other lambdoid
phages that have not been studied in the laboratory
suggest that they often carry some combination of the 
above genes as well as other genes of unknown function. 
Of particular interest is the P22-like phage ASPE-1, a rather 
distantly related lambdoid phage, which carries a putative 
DNA polymerase gene in this position (359). 
Control Functions

Comparative studies with different lambdoid phages in
the control region around the cI gene predate the advent
of genome sequences. The earliest examples are uses of
phages with hybrid immunity regions to decipher the
nature of gene regulation by cI. Such studies have continued
to the present and we cite only a few examples.

Studies of λ CI and Cro repressors have included high-resolution
structure determinations and investigations into
the detailed nature of sequence-specific recognition of DNA
by the repressors (4, 8, 19, 21, 73); parallel studies with the
phage 434 repressors (3, 5, 70, 386) have provided valuable
insights into which features of this process are general and
which are specific to each phage. The various lambdoid
phage CI proteins are now known to have many different
operator binding specificities (immunities), but all lambdoid
phages have the CI/Cro “switch.”

One of the foundations of our understanding of
how CI and Cro repressors interact with their operators
and compete with each other for binding has been the
fact that the three subsites of OR are slightly different in
sequence and interact differentially with the two repressors.
It was therefore a surprise to learn that the OR1 and OR3
subsites of HK022 have identical sequences (261). It was
shown subsequently that if these subsites in λ are
changed to be identical, the phage still behaves apparently
normally (231), and this has led to a rethinking, still
under way, of our understanding of how the details of
this regulation work.

In our comparison group, HK97 has the same immunity
as λ (181), P22 has a different immunity (277, 316),
and N15 has a still different immunity (232, 291). P22 also
has a second prophage control region, called ImmI (a moron
in the late operon), that encodes an antirepressor whose
synthesis, in turn, is controlled by two adjacent repressor
genes (mnt and arc) and an antisense RNA regulator
(226, 315, 361, 394). The biological role of this antirepressor,
which inactivates the prophage repressor (341), is not
known. Could it be a device for making induction
irreversible, a modulator of the lysis/lysogeny decision, or a
modular induction system? If it is the latter, then we do
not know its induction signal.

Phage N15 also encodes an antirepressor (288) in its 
equivalent of the λ Ea region, whose expression is controlled
by a not yet fully understood RNA processing system
that appears to be very similar to that of phage P4 (111, 112) 
(for review of phage P4, see chapter 26). It is also of interest 
to note that, unlike the integrating lambdoid phages, N15’s 
repressor does not completely shut off the early left and 
early right operons (291); they are on at a (presumably) low 
level in the lysogen, probably because some of their genes 
are required for replication of the prophage plasmid. 

The integrating lambdoid phages all appear to encode
a CII-like protein at the same position as λ, but they
vary quite widely in sequence, suggesting that they have
different operator specificities. The only three CII proteins
that have been studied (those from λ, 21, and P22) all
bind a similar TTGCN6TTGC/T motif, yet cannot substitute
for one another, suggesting additional not yet understood
specificities (162, 397). The CII of HK97 is only about 34%
identical to λ CII, and its target specificity has not been
studied (181). Again, N15 has no apparent CII homolog (291).
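
As an illustration only, the shared CII binding motif mentioned above can be written as a simple pattern and used to scan a sequence. The sketch below assumes the motif reads as TTGC, then six bases of any identity, then TTGC or TTGT; the function name and the test sequence are hypothetical and are not taken from any phage genome. Only the given strand is scanned, and a match says nothing about the additional specificities that keep the CII proteins from substituting for one another.

```python
import re

# Assumed reading of the motif: TTGC + any six bases + TTGC or TTGT.
# The lookahead lets overlapping candidate sites be reported too.
CII_MOTIF = re.compile(r"(?=(TTGC[ACGT]{6}TTG[CT]))")


def find_cii_like_sites(sequence):
    """Return (0-based start, matched text) for each motif match on the given strand."""
    seq = sequence.upper()
    return [(m.start(), m.group(1)) for m in CII_MOTIF.finditer(seq)]


# Made-up example sequence (hypothetical, for demonstration only).
example = "AATTGCACGTACTTGCGGTTGCAAAAAATTGTCC"
for start, site in find_cii_like_sites(example):
    print(start, site)
```

A pattern match of this kind is at best a first filter for candidate sites; it cannot by itself predict which protein, if any, binds there.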

Comparative studies with gene N have also been informative.
Neely and Friedman (258) showed that the N
protein of phage H-19B has different requirements for
host factors than does the λ N protein, and these differences
correlate with differences in the nut site sequences.
More surprisingly, it was discovered that phage HK022
accomplishes antitermination of the early transcripts
without an N protein, through a strictly RNA-based mechanism
(13). At the location in the HK022 genome where
an N gene would be expected there is a gene called nun
that encodes an N-like protein that nonetheless has no
role in affecting HK022 transcription. Instead, Nun is
made from the repressed prophage and acts as a defense
against infection of the lysogenic cell by other phages,
such as λ (204, 208). In the presence of Nun, the λ N
protein is displaced by Nun from its usual antitermination
role. Nun interacts with the same host factors and phage
nut site as does λ N protein, but instead of causing antitermination
it promotes termination, thereby aborting the
incipient λ infection.

In our comparison group, P22 has an N-like gene, 24,
that has analogous function but only minimal sequence
similarity to λ N and a different RNA nut site binding
specificity (114). It is not known whether N15 has an
N-like activity. It has no candidate gene in the equivalent
position in the early left operon, but the first gene in
the early right operon encodes a λ Q-like antitermination
protein (291). Regulation of the N15 early operons is
not yet understood, and may be different from that of 
integrating lambdoid phages because of the need to leave 
the operons partially on in a lysogen (above). 

DNA Replication 

Lambda contributes two proteins to its lytic-cycle DNA
replication: the products of genes O and P. The rest of
the replication apparatus is host-encoded and recruited
to the replication complex either directly or indirectly
by the O and P proteins (270, 349). Phage P22 does not
have a homolog of the λ P protein, but it does encode at
the same position in the genome a homolog of DnaB, the
protein that is recruited to the replication complex by
gpP in λ (10, 379, 380). Thus, other than not requiring the
host DnaB, the replication process appears to be the same
as λ’s.

Detectable similarity between λ gpO and its P22
homolog, gp18, extends only over the N-terminal portion
of the protein, which is the part known to be responsible
for binding the origin DNA in λ. Similarity does not
extend over the C-terminal portion, the part that binds
gpP (381). HK97 has replication genes that are rather
close homologs of those in P22 (181).

The other integrating lambdoid phages that have been
characterized have one of these two arrangements: an
origin binding protein gene with a nested origin sequence
followed either by a λ gpP-like protein that recruits the
host DnaB or by a P22 gp12-like DnaB homolog. However,
not all of the known lambdoid replication strategies
have been studied. For example, ΦP27, Gifsy-1, and Gifsy-2
carry a homolog of E. coli dnaC; in the first it lies between the O
and P position genes (296) and in the last two in the P position
(242, 391); DnaC is a “helicase loader” (82, 220) that has
very little sequence similarity to λ gpP or DnaB. Fels-1
encodes a large protein here that has weak primase similarity
(242), and SfV has a gene with no homologs of
known function in the P position (6). These appear to
be “new,” as yet unstudied, lambdoid replication initiation
paradigms.

In replication, as for many of the other early functions,
N15 is substantially different from the more conventional
lambdoid phages. The one phage gene that has been
implicated in replication, repA, is located to the left of the 
immunity region rather than to its right as in the other 
lambdoid phages (232, 286, 291). RepA is a large protein 
of 1324 amino acids. The repA gene is required both for 
replication of the linear plasmid prophage form of the chromosome
and for lytic-cycle replication (289). It is sufficient
to drive replication of a circular plasmid in the absence 
of any other phage genes, arguing that it is likely the only 
N15 gene with an essential role in phage replication and 
that the origin of replication is most likely within its 
coding region. Database searches suggest that RepA may be 
a multi-domain protein, with weak sequence matches to 
DNA primases near the N-terminus and to origin-binding 
helicases from animal viruses near the middle of the 
sequence (291). 

Because the N15 head genes are very similar to those
of λ, it is presumed that a concatemer of the genome is
produced for DNA packaging, and therefore that lytic
replication may resemble that of λ. For prophage N15
replication, which has been investigated experimentally, it 
appears that the mechanism is that the linear, covalently 
closed hairpin-ended plasmid is replicated to produce 
a double-length, double-stranded, head-to-head circular 
dimer of the genome, which is resolved into two single-length
linear plasmids by the action of protelomerase
(48, 84, 287, 289). The nature of the switch between
the very different lytic and prophage replication modes
in N15 is completely unknown, but protelomerase must
not function during lytic growth since cos cleavage combined
with telRL cleavage would give rise to half-genome
fragments of DNA. 


The “Nin” Region (the “Poster Child” for Genome Mosaicism)

Between λ’s DNA replication genes, O and P, and the late
regulator gene, Q, lie 10 tightly packed genes which can be
deleted with no apparent effect on phage growth under
laboratory conditions. This region is usually called the
“Nin region,” after the nin5 deletion that renders phage
growth N-independent by removing transcription terminators
that block gene Q transcription in the absence of gpN
(78). Functions have been established for a few of the
lambdoid Nin genes (protein phosphatase, homologous
recombination proteins, Holliday junction resolvase, and
escape from Rex exclusion), and DNA methylase and antirepressor
functions are predicted by homology, but in no
case are their roles essential or clearly understood in the 
phages’ life cycles (235, 347, 352, 362). These genes often 
occur in parallel positions in different lambdoid phages, 
and in these cases the sequences of the homologous genes 
are typically very similar and the differences are biased 
toward changes that are silent in the encoded amino acid 
sequence (181; R. Hendrix, unpublished observation). This
observation argues that their functions are under positive 
selection. 

Perhaps the most striking observation from comparison
of the Nin region across phages is that while most lambdoid
phages have such a group of genes, varying in number
from one (phages Fels-1 and ΦP27) to a dozen genes (phages
HK022 and Sf6), they are not always the same genes.
Thus λ, HK97, and P22 all have 10 Nin region genes, but λ
and HK97 share only five genes of the 10, λ and P22 share
seven, and HK97 and P22 share five. The overall relationships
are similar to what one would expect if each phage
had “picked” its set of Nin genes from a menu of perhaps 
30-50 genes (181). Because there is evidence (cited above) 
that the Nin region genes provide a selective benefit to 
the phage beyond simply occupying space in the genome, 
we postulate that these genes give the phage a particular 
constellation of tools for dealing with its environment, 
or, put another way, the particular set of genes that a 
phage has in this region contributes to its adaptation to a 
particular ecological niche. 

Phage N15, in keeping with its many differences from 
the rest of the comparison group in the right arm of its 
genome, does not have any homologs of the genes found 
in the Nin regions of the other lambdoid phages. It is 
not yet clear whether any of the uncharacterized genes of 
the N15 right arm might show a similar relationship to 
corresponding genes of other N15-like phages. 

The Q Antitermination Function and Late Operon Expression

The well-studied λ Q gene lies at the end of the early
rightward transcript and its product, the Q protein, causes
a high level of late gene transcription by facilitating release of
RNA polymerase from a pause site just downstream of
the late promoter, PR'. In addition, gpQ renders the RNA
polymerase insensitive to downstream cis termination
signals by contacting its σ70 subunit (237, 259, 303, 306).
All the lambdoid phages examined to date have an
apparent Q homolog and, with the exception of phage
N15 and its relative φKO2, they are all located at the
same place in the gene order, just upstream from the late
promoters they regulate.

The Q proteins of phages 21, 82, and λ have been
studied and they form three sequence families and have
three different target specificities (133, 401). The Q proteins
of HK97 and P22 are both more than 96% identical to
that of λ and so almost certainly have the same specificities
(304). The Q proteins of phages 21, 82, and λ share
only weak sequence homology, but all appear to function
in the same manner (e.g., 237). The N15 Q homolog (gene
40) overlaps the 3' end of the cro gene homolog. In the
other phages in our comparison set the space between the
cro and Q genes is occupied by 13 or 14 other genes;
whether this difference in genomic organization has biological
significance is unknown.

The location of the late promoter in phage N15 is not 
known experimentally, but transcriptional timing data 
suggest it may lie in the space between genes 51 and 52 
(291), which is well removed from the Q homolog but 
roughly at the same position in the overall gene order as is
PR' in phage λ and the others. This putative promoter is
followed by a putative transcription terminator, so if the
N15 Q protein acts similarly to the λ Q protein (also not
tested experimentally), it could facilitate late transcription
by allowing read-through of this terminator. This
picture is clouded by the observation that the N15 Q gene
is expressed from the prophage, as are a majority of the 
other early genes (291). 

The Promoter Proximal Portion of the Late Operon

The region between the late operon promoter, PR', and
the lysis genes is only a few hundred base pairs in λ, HK97,
and P22, and in N15 it contains a putative DNA adenine
methylase gene; however, in some related phages it is substantially
larger. Most notably, many lambdoid phages
have been found to carry Shiga toxin genes in this
location (191, 269, 296, 357). Shiga toxin is the most important
toxin produced by enterohemorrhagic E. coli (130).
In phages 933W and H-19B, for example, there are nearly 
5 kbp between the late promoter and the lysis genes, which 
contain the two Shiga toxin genes and several other 
genes of unknown function (257, 269). The Shiga toxin 
genes can be classified as morons (above), but it is not yet 
clear whether they are regulated as late operon genes, 
lysogenic conversion genes, or both (364, 365). 


In addition, several lambdoid phages carry what
appear to be functional tRNA genes immediately downstream
of the late promoter. Phages Sf6, 21, 933W, and ΦP27
(as well as defective prophages, for example, 933N, 933O,
933R, and 933U in E. coli strain EDL933) all carry two or
three putative tRNA genes in this location (61, 266, 269,
296). Sf6 and 21 carry asparagine and threonine tRNAs,
while 933W and ΦP27 carry isoleucine and arginine
tRNAs. The role of such tRNAs is not clear, since the codon
usage of the late operons does not show any very convincing
overrepresentation of the codons recognized by these
tRNAs (61, 269).

Cell Lysis Genes 

The λ genes S (holin), R (endolysin), and Rz (biochemical
function unknown) lie between the late promoter and the
head genes. Like phage λ, HK97 also has a transglycosylase
gene at the middle of its lysis cassette, but P22 and
N15 have a true lysozyme gene at this position (370). 
Lysozymes—which are members of a sequence family that 
includes hen egg white lysozyme and the lysis enzymes of 
other phages including T4—hydrolyze the same bond as 
do transglycosylases but by a subtly different enzymatic 
mechanism. Transglycosylases and lysozymes are not obviously
homologous, but they are thought to be functionally
interchangeable. Other non-lambdoid phages, including the 
well-studied phage T7 (chapter 20), have a third type of 
endolysin, an amidase, which hydrolyzes a different bond 
in the peptidoglycan to the same effect. 

No lambdoid phages have been identified that use an
amidase, but λ’s cousin, phage Mu, provides an instructive
example. The lysis gene of Mu encodes a lysozyme, but
FluMu, an apparently intact prophage in the genome of
Haemophilus influenzae that has a high degree of organizational
and sequence similarity to Mu, carries a T7-like
amidase gene in the corresponding position (249). The cell
wall hydrolases thus provide a particularly clear example
of genes that can participate in analogous but non-homologous
substitutions.

The holin genes of HK97 and P22 are very similar to λ
S (51, 181). Phage 21 encodes a member of a distantly
related class of holins that has been studied in the laboratory
(16), and in N15 at this position there are similarly nested
genes that are predicted to encode a protein which, although
it has little overt similarity in sequence to known holin
proteins, has a similar pattern of hydrophobic and charged
amino acids (291). Likewise, phages HK97 and P22 each
have clear Rz homologs immediately downstream of
the endolysin gene position (51, 181).

Similar experiments to those in λ with genes Rz and
Rz1 (above) have not been done in the other lambdoid
phages, but internal out-of-frame open reading frames
starting, as in the λ case, with N-terminal signal sequences
can be identified in each case. That observation lends
support to the idea that Rz1-like embedded genes may
have roles in cell lysis in those phages as well. N15 does not
have an Rz homolog, but has a non-homologous gene
in this position that also contains a potential signal
sequence-containing internal out-of-frame gene (291).

It is not entirely clear whether the different lysis 
gene products interact with each other in specific ways; 
however, the fact that the phage T4 lysozyme and T7 Rz 
can replace their quite different P22 homologs without
detriment in the laboratory suggests that they may
function largely independently (51, 299). It is also not 
clear that all types of lambdoid lysis modules have been 
studied; the N15-like Klebsiella phage φKO2 causes the
host cells to lyse yet it carries no homologs to any known 
lysis genes (52). 


Extreme Right End of the Genome 

At the extreme right end of the genome of lambdoid 
phages, past the lysis gene cluster, there is typically a 
few kilobase pairs of sequence that does not have a consistent
organization from one phage to another. There
are some genes in these regions with known or inferred
functions, for example, the λ bor gene (17), a putative
methylase gene in N15 (291), or the rha gene in P22 which 
inhibits successful infection of an IHF defective host 
(152). HK97 has an apparently intact gene, gene 73, of 
unknown function in this region that has homologs at 
completely different locations in two other phages, D3 
and 933W. Much of the rest of this region in most of 
the lambdoid phages examined contains no clear genes 
or, at best, short open reading frames of unknown 
functionality. 

It is not clear at this point how to understand 
these regions. It may be that much of this sequence is 
non-functional in a genetic sense and is tolerated only 
because it is not expensive to maintain and it has not yet 
been deleted, or because it acts as a “stuffer” to make 
DNA packaging more efficient (see “The b2 Region” above).
It is tantalizing to note that like the “Ea” and “b2” regions
(above), this region lies adjacent to a site where DNA is
broken during normal λ growth; perhaps errors in cos
cleavage contribute to the nature of this portion of the 
genome. Alternatively, it may be that it encodes beneficial 
functions that have not yet been recognized. A possible 
candidate for such a function would be small regulatory 
RNA molecules of the sort that have been found recently 
in bacterial genomes (128). 

Lysogenic Conversion Gene Diversity 

Many temperate phages carry genes that are expressed
from the prophage and which affect the properties of the
host bacterium (55, 334). As was mentioned above, λ carries
six such genes: cI, rexA, rexB (these three are expressed
at least in part as an operon), sieB, lom, and bor, which lie
at four locations in its genome. CI, RexAB, and SieB are
all able to block infection of the lysogen by other phages. 
Lom and Bor proteins, encoded by morons in the late 
operon, appear to make the lysogen more able to parasitize 
a mammalian host, but this has not been studied in 
great detail (17, 358). 

In addition to repressor immunity functions, our comparison
group phages each carry a different set of lysogenic
conversion genes. HK97 has three late operon morons,
genes 15, 20, and 22-23, that may be expressed from the
prophage, and two genes, 48 and 49, that replace λ's
rexAB genes (181). None of these have homology to genes of
known function. N15 carries a late operon moron gene, cor,
downstream of its tail tip fiber gene, that blocks phage
φ80 infection, and two putative DNA methylase genes on
either side of the lysis gene cluster (291, 363). P22 carries, 
in addition to its two immunity regions, one of which is 
a clear late operon moron (above), two other types of phage 
superinfection exclusion systems, sieB (early left operon 
moron) (283) and sieA (late operon moron) (343), and 
three genes, gtrABC, in the “b2” region which encode 
enzymes for addition of glucose residues to the surface 
O-antigen polysaccharide and which are responsible 
for changing the antigenic properties of the host bacteria 
(360). 

Essentially all lambdoid phages that have been examined
carry lysogenic conversion genes. Such genes, if they are
novel, are not always easily recognizable, but morons
(which are usually lysogenic conversion genes) are unambiguously
recognizable when they are inserted into a
well-studied, stereotypical region such as the λ late operon.
In the various sequenced late operons that have explicit
homology to λ, such insertions are present in the following
locations (we use the λ gene names for positional
reference and list some of the phages with morons at
each position). Late promoter proximal region: tRNAs
(phages 21, Sf6, 933W; 61, 133, 269), Shiga toxin (933W,
H-19B; 257, 269), putative DNA methylases (N15; 291),
porin (PA-2; 30); within the lysis gene cluster between R
and Rz: antirepressor (933W; 269); between the lysis and
head genes: DNA methylase, rha, bor (N15, P22, λ; 51, 363);
between Z and U: novel gene (Gifsy-1; 242); between
G-T and H: novel gene (HK97; 181); between M and L:
superoxide dismutase, lom (Gifsy-2, Fels-1; 107, 242);
between K and I: novel genes (HK022, HK97, and φK02;
52, 181); between I and J: superoxide dismutase, novel
genes (CP-933V, HK022, HK97, Fels-1; 181, 242, 266);
between J and stf: lom, cor (λ, P-EibA, 933W, N15, HK022;
181, 269, 291, 313).

No obvious moron or lysogenic conversion genes 
have been found inserted within the head gene cluster.
It is not known whether this is a sampling artifact or
whether this region for some reason cannot tolerate such
insertions, but clearly they can be tolerated in a number of
different locations. It is interesting to note that lom and
superoxide dismutase genes are found at more than one
location in different phages, suggesting that they have
either entered the lambdoid phages more than once or
moved since their original entry. Thus, in the overall gene
arrangement of the lambdoid phages, lysogenic conversion
genes have been known for some time in the late
operon, b2 region, promoter proximal early left operon,
and transcriptionally downstream of the prophage repressor
gene. There is no obvious reason why other locations
will not be found to harbor such genes, and in fact
two Salmonella lambdoid phages, Gifsy-1 and Gifsy-2,
have recently been found to harbor identical lysogenic
conversion “inserts” inside the Nin region (242). This insertion
contains a promoter that drives expression of two
genes, a putative antirepressor and a homolog of bacterial
dinI genes (41).

Regulation of lysogenic conversion genes can be complex.
We described sieB and antirepressor regulation
briefly above, and another case in point is the rexAB
genes of λ, whose regulation (and mode of action) remain
to be fully understood (99, 218, 329). The fact that lysogenic
conversion genes affect the host but are expressed
from the prophage also raises interesting possibilities
for interactions with host bacterial regulatory networks.
This has not been studied in many instances (e.g., regulation
of the expression of λ's lom and bor genes remains unstudied);
however, it is known that some are regulated
by the host's SOS regulon repressor LexA (the Gifsy-1 and
-2 Nin region insertion (41), and the umuD and dinI
homologous genes of φK02 and probably N15 (52)) and
that the Shiga toxin genes on H-19B are regulated by
an iron-dependent host Fur regulator protein (364). Finally,
the phage P22 O-antigen-modifying gtrABC genes
are subject to an as yet poorly understood
phase variation mechanism (117). The study of the
regulation of lysogenic conversion genes should be fertile
ground for future research.

Lambdoid Prophages in Nature 

Any discussion of lambdoid (or any temperate) phage
evolution and diversity cannot ignore the ubiquitous and
plentiful prophages in bacterial genomes in nature. Surveys
of the presence of prophages in Proteobacteria isolates
have usually shown that a majority are able to release
tailed-phage-like particles, many of which are infectious
and many of which appear to be λ-like (262, 320, 321).
Indeed, the genome of the laboratory strain LT2 of Salmonella
enterica serovar Typhimurium contains four functional
prophages, three of which are clearly lambdoid (242).
The eight currently published complete bacterial genome
sequences from enteric γ-Proteobacteria genera, including
species of Escherichia, Salmonella, and Shigella, contain
a large number of intact and defective prophages, at least
50 of which appear to be lambdoid (50). In addition, the
sequences of eight nonenteric proteobacterial genomes
show that they harbor at least an additional dozen λ-like
entities.

It has been reported that the environment contains 
three to ten tailed-phage particles for each bacterium 
(26, 378). Although this is not known to be the case specifically
for the lambdoid phages, it seems clear that a very
significant fraction of the genes of this phage group on 
Earth actually reside in prophages. Clearly any analysis 
of lambdoid phage diversity, both from the standpoint of 
the data available and the actual situation in nature, 
should include these sequenced prophages. However, this 
plethora of prophage data has only rather recently become 
available and is yet to be fully analyzed. Therefore, we 
will not attempt to review it in detail here, but will only 
mention a few aspects of the importance of prophages in 
lambdoid phage evolution and diversity. 

It is important to realize that a majority of the “prophages”
that are found in bacterial genome sequences
are not complete, functional phage genomes. Many
are defective prophages in various stages of mutational
decay, having obviously suffered point mutations, insertions,
and deletions. Some of their genes must therefore
be nonfunctional. But this does not mean that such
prophages do not harbor functional genes or parts of
genes. For example, the E. coli K-12 defective lambdoid
prophages, Rac and Qin, have been shown to carry functional
homologous recombination genes and lysis genes,
respectively (102, 187).

In Rac, it is interesting to note that the early right and
late operons are largely deleted, and the genes that remain
there are in a very advanced state of decay (32). Nevertheless,
the early left operon (and its control) appears to be
largely intact and contains, at least, functional repressor,
ral, and integrase genes as well as two homologous recombination
genes (36, 104, 106, 187, 205). This disparity of
intactness between the two parts of the prophage could
be the result of recent “repair” of the early left operon either
by recombination with another prophage in the same cell
or by a “passing” infecting phage. Such repair could occur
through simple recombinational replacement or by integration
of a second prophage to form tandem prophages
followed by deletion.

We mention this to point out that recombinational
generation of diversity among the lambdoid phages can
occur by exchange of genetic material between a prophage
and infecting phage (in either direction) or between
two prophages, as well as between two phages that
happen to infect the same cell. The former has been
demonstrated in the laboratory for lambdoid phages in a
number of situations (e.g., 102, 186), and comparison of
the two quite closely related O157 E. coli strains EDL933
and Sakai shows that a number of inter-prophage recombinational
events have almost certainly taken place since
their divergence (50, see also 297). Some prophages may
have very long times in residence (e.g., it may have taken
millions of years for the early right and late operons of
the Rac prophage to reach their current, largely decrepit
state), and nearly all enteric bacteria appear to carry
lambdoid prophages, so the former two scenarios seem
quite likely to contribute to the recombinational generation
of diversity in temperate phages, perhaps even more
likely than two different but related virions simultaneously
infecting the same cell.

These many experimentally unstudied prophage
sequences also serve to point out additional varieties of
lambdoid phage functional modules. For example, Fels-1
and Gifsy-2 (both largely unstudied functional phages,
but sequenced as prophages in the S. enterica LT2 genome;
242) have genes in their head locations (“analogs” of λ C
and E) that are unrelated or only extremely distantly
related to those of the experimentally studied lambdoid
phages. The Gifsy-2/Fels-1 maturation protease-coat gene,
by virtue of weak domain similarities to other non-lambdoid
phages, appears to represent a new head assembly paradigm
in which the head maturation protease and coat
protein are translated as one large polypeptide rather
than as separate proteins. In addition, as mentioned above,
Fels-1 and Gifsy-1 have replication genes that are different
from those of the well-studied lambdoid phages.

This sort of observation need not be limited to prophages
that are known to be fully functional. An analysis of
254 identifiable prophages in 84 completely sequenced
bacterial genomes (at the time of this writing) found
only a handful of convincing examples of possibly non-phage
genes having been moved into prophages during
the prophage decay process (50). It is thus at least
provisionally justified to consider prophage genes that do
not have homologs in known phages as potential “new”
phage-borne genes. Perhaps the best historical example
of such a situation, also mentioned above, is the recE homologous
recombination gene of E. coli K-12, which resides
in the defective Rac prophage. This gene was unique
for many years until homologs were discovered in the 
“homologous recombination module” locale of the genomes 
of phages Gifsy-1 and Gifsy-2. In addition, it has no known 
non-phage homologs, so the recE gene family, originally 
discovered as part of a defective prophage, is now clearly 
a “phage gene.” 

Finally, we mention here the now well-known fact
that many of the pathogenicity functions of bacteria that
cause human or veterinary diseases are encoded by
genes carried on prophages, both lambdoid and otherwise
(14, 35, 55, 67, 366) (and see Shiga toxin discussion above
as well as the broader discussions of chapter 47). We will
not attempt to enumerate these here but only point out
that bacteria occupy many ecological niches besides those
of human pathogenic relevance. In these cases as well,
it seems inconceivable that prophages are not providing
genetic functions to their hosts that better adapt those hosts
to their ecological situation.


Summary 

Clearly, even though phage λ has been under study for
over 50 years, there still remains much to be learned 
from studying it even further (see, for example, 115). The 
recent sequencing of a number of lambdoid phages that 
infect enteric bacteria and lambdoid prophages that 
reside in enteric bacterial genomes has made the field 
biologically much more rich and varied. The exact extent 
and nature of this variety remains to be fully described, 
and the discovery of bacterial “virulence” genes in lambdoid 
phage genomes has made the study of these phages 
more urgent due to their relevance to human health. One 
important question that remains is how the extensive 
diversity in lambdoid phages correlates with the hosts 
that particular phages infect. Are particular gene types 
more useful in some hosts than in others? Does sequence 
relatedness among phages with the lambdoid life-style 
and gene arrangement gradually dissipate as their hosts 
become more distantly related, or will we find abrupt 
changes in relatedness of phages that infect particular 
host groups? These and other questions regarding their 
evolution and diversity will only be answered as more 
sequence is determined and more genetic and biochemical 
studies are performed on additional members of the 
lambdoid phage group.

N15: The Linear Plasmid Prophage

NIKOLAI V. RAVIN 


The temperate bacteriophage N15 is unique among
Escherichia coli phages in that its prophage is not integrated
into the bacterial chromosome but instead is a linear
plasmid molecule with covalently closed ends. Linear plasmids
are very unusual in the Enterobacteria and only two
others have been reported: phiK02 in Klebsiella oxytoca
(4, 43) and pY54 in Yersinia enterocolitica (16). However, such
replicons are common in the spirochete genus Borrelia (1).
Here I will summarize the most relevant work on phage N15 
with a special emphasis on the mechanism of replication, 
telomere formation, and control of lysogeny. 

Bacteriophage N15 was isolated by Victor Ravin in 1964 
and was initially studied in Moscow, Russia (13, 33-36, 38, 
39). N15 belongs to the lambdoid phage family, as suggested 
on the basis of cross-hybridization of their DNAs (39), and is 
similar to phage λ with respect to the length of the genome,
morphology of phage particles and plaques, burst size,
latent period, and lysogenization frequency (34) (for more
on lambdoid phages, see chapter 27). As was shown by
Victor Ravin and coworkers in 1967-1970, an unusual
feature of phage N15 is that its prophage is located
extrachromosomally. Three lines of evidence led to this conclusion.
First, it was observed that lysogenic bacteria lose the prophage
at nonpermissive temperatures if they have temperature-sensitive
mutations in “early” genes (33, 38). Second,
results of bacterial crosses show an absence of the prophage 
in the bacterial chromosome (35). Third, total DNA isolated 
from lysogenic cultures can be separated into bacterial and 
phage fractions by sucrose gradient centrifugation (39). 

The next important step in the investigation of N15
biology was performed by Valentin Rybchin and colleagues
(Saint Petersburg, Russia), who showed that the N15 prophage
is a linear plasmid with covalently closed ends. Linearity
of the N15 prophage was inferred from the physical
mapping of phage and plasmid prophage DNAs, which
showed (i) that both DNAs are linear molecules 46.3 kb
long, and (ii) that the two maps are circularly permuted
(45). The ends of the N15 prophage (telomeres), designated
telL and telR, are covalently closed hairpins in which one
strand simply turns around and becomes the other strand
(22, 23, 45). The hairpin nature of the N15 prophage
ends was inferred from two observations: (i) terminal, but
not internal, restriction fragments of N15 plasmid DNA
renature rapidly after heat denaturation and quick cooling;
(ii) treatment of plasmid DNA, prior to restriction enzyme
digestion, with S1 nuclease, which is known to cut
single-stranded DNA, abolishes the rapid renaturation
effect (45). Mature N15 phage DNA, like phage λ DNA,
has 12 bp single-stranded cohesive ends (26), named cosL
and cosR.
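
The notion of circularly permuted maps can be illustrated with a minimal Python sketch. It assumes a hypothetical circular map of six markers (A to F, which are placeholders and not N15 genes) that is opened at one point to give the virion map and at another point to give the prophage map; the function names are illustrative only.

```python
def linearize(circle, start):
    """Open a circular marker order at index `start` to give a linear map."""
    return circle[start:] + circle[:start]


def is_circular_permutation(map_a, map_b):
    """True if map_b is a rotation of map_a (same circle, different opening)."""
    if len(map_a) != len(map_b):
        return False
    doubled = map_a + map_a
    n = len(map_a)
    return any(doubled[i:i + n] == map_b for i in range(n))


# Hypothetical marker order on the circularized genome (not real N15 genes).
circle = ["A", "B", "C", "D", "E", "F"]

virion_map = linearize(circle, 0)    # opened at the cos site (assumed index)
prophage_map = linearize(circle, 3)  # opened at telRL (assumed index)

print(virion_map, prophage_map)
print(is_circular_permutation(virion_map, prophage_map))
```

Two linear maps that pass this test share the same marker order on a circle and differ only in where the circle was opened, which is the relationship reported for the N15 phage and prophage maps.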

The above data suggest the following mechanism of 
conversion of phage DNA to the prophage plasmid (22). 
After infection of an E. coli cell, the phage DNA becomes 
circularized via its cohesive termini. A special phage-encoded
enzyme, protelomerase (prokaryotic telomerase),
then introduces a staggered nick in the telRL region, which
contains a large palindromic sequence. Annealing of the
self-complementary single-stranded ends and formation of
phosphodiester bonds results in creation of hairpin structures at
each end (figure 28-1).
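
Why a staggered cut in a palindrome yields ends that can fold back on themselves can be shown with a short Python sketch. It is a toy model only: the 22 bp sequence below is a hypothetical palindrome (not the real telRL or telO sequence), and the 3-nucleotide offset of each cut from the center is an assumption chosen for illustration.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Return the reverse complement, written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def is_palindrome(seq: str) -> bool:
    """A DNA palindrome reads the same 5'->3' on both strands."""
    return seq == revcomp(seq)


def staggered_cut(top: str, stagger: int) -> tuple[str, str]:
    """Cut the top strand `stagger` nt left of the center and the bottom
    strand `stagger` nt right of it; return both 5' overhangs (5'->3')."""
    center = len(top) // 2
    if not 0 < stagger <= center:
        raise ValueError("stagger must lie within the half-site")
    window = top[center - stagger:center + stagger]
    left_overhang = revcomp(window)  # bottom strand, stays on the left fragment
    right_overhang = window          # top strand, stays on the right fragment
    return left_overhang, right_overhang


HALF_SITE = "GATTACAGGCT"                  # hypothetical 11-nt half-site
TOY_SITE = HALF_SITE + revcomp(HALF_SITE)  # 22-bp perfect palindrome

left, right = staggered_cut(TOY_SITE, 3)   # 3-nt offset is an assumption
print(is_palindrome(TOY_SITE), left, right)
print(is_palindrome(left), is_palindrome(right))
```

Because the cut window is centered on the symmetry axis, each single-stranded overhang is itself a palindrome, so it can anneal to itself and be sealed into a hairpin, as the model in the text proposes.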


Overall Organization of the Genome 

The nucleotide sequence of the N15 genome has been
recently determined (37). The genome contains 46,363 bp, of
which about one half are similar to the bacteriophage λ
sequence (figure 28-2). This sequence similarity maps predominantly
to the left arm of the phage genome (figure 28-2),
which contains the structural genes for the proteins
required for virion head and tail assembly. Genes 1 through
21 from phage N15 display a one-to-one correspondence to
the phage λ genes A through J. There is as much as 90%
identity between the amino acid sequences of the phage
N15 and phage λ head gene products. Some parts of this
region of phage N15 are more closely related to other, non-λ,
lambdoid phages. From gene 17 (the analog of the λ tail-assembly
gene M) to gene 25, except for gene 24, phage N15
matches lambdoid phages HK97 and HK022 better than λ.
The above observation that phage N15 carries λ-like head-
and tail-protein genes correlates well with the observation
that N15 virion morphology is similar to that of lambdoid
phages (37).

The N15 gene 24 is the homolog and functional analog of
the cor gene of phage φ80 (49) and is responsible for the inability
of N15 lysogens to adsorb bacteriophages N15, T1, and φ80
(36). To the right of the just-described block of morphogenic
genes there is gene 26, a homolog of the E. coli umuD gene,
which is involved in error-prone DNA damage repair. The
next two genes in the left half of the genome are homologs of
the sopA and sopB genes of the F plasmid and determine the
segregation stability of the N15 prophage (see below).

The division between the left and right arms of the N15
genome is determined by the site (telRL) at which phage
DNA is cut by protelomerase to make the linear plasmid
prophage (figure 28-1). In contrast to the left arm, only 10 of
the 35 N15 right-arm genes have homologs in lambdoid
phages. Among these lambdoid-like genes are genes 38, 39,
and 40, which are homologous to genes cB, cro, and Q, respectively,
and are responsible for control of lysogeny (see
below). Genes 53, 54, 55, and 55.1 are thought to encode
lysis functions and also have homologs in the lambdoid
phage family (37). The three operons located in the right
arm that are specific to phage N15 reflect its unusual lifestyle:
the protelomerase gene (29), which is located rightward
of telRL; the antirepressor operon including genes
30-32; and the replication region comprising genes 33-37,
which are supposed to be cotranscribed from the promoter
controlled by the CB repressor. A detailed description of other
right arm genes and discussion of their possible functions
can be found in (37).

Figure 28-2 Map of N15 virion chromosome. The N15 linear virion chromosome is shown with a scale in kilobase pairs.
Rectangles immediately above and below the scale represent predicted genes that, respectively, are transcribed rightward
and leftward; their colors indicate similarity to known genes in the following way: genes that have been found in lambdoid
phages (gray); genes that have been found in plasmids and non-lambdoid phages (black); no database match (white).
The N15 gene names are given within or near the rectangles and alternate descriptive names are indicated above or below.
Strongly predicted promoters (arrows in direction of transcription) and transcription terminators (T) are also indicated.
Asterisks (*) mark the position of the centromere sites involved in plasmid partition, and small filled circles mark putative
CB repressor binding sites. The lines at the bottom indicate the minimal sequences that are known to drive linear and
circular plasmid replication in E. coli.

Figure 28-1 Phage N15 chromosome linearization.
A: Mechanism of conversion of phage DNA into linear
plasmid. cosL, cosR, single-stranded cohesive ends; cosRL,
the cos site after annealing and ligation of
cohesive ends; telRL, uncut target site of protelomerase; telL
and telR, left and right hairpin ends of the prophage created
by protelomerase. B: Sequences of telRL site and hairpin ends
of the prophage. The central 22 bp ideal palindrome telO
is underlined.

Lysogeny Control 

The extrachromosomal location of the N15 prophage
apparently requires controlled expression of not only the
repressor protein but also the genes responsible for prophage
maintenance. In fact, analysis of N15 transcription patterns
shows that about half the N15 genes are transcribed in the
lysogen (37). This situation is radically different from that
of λ lysogens (chapters 7 and 8) and suggests the possibility
that phage N15 displays more complex regulatory mechanisms
than phage λ. In order to identify genes involved in
the control of lysogeny, lysogeny-defective mutants that 
form clear plaques were isolated. The mutations responsible 
for this inability to display lysogeny have been mapped 
to three distinct loci: immA, immB, and immC (44). Two 
of them — immA and immB, the phage N15 secondary 
and primary immunity regions, respectively — have been 
characterized. 


Primary Immunity Region (immB) 

Prophage superinfection immunity is encoded at immB,
the primary immunity region, which was characterized by
Lobocka et al. (21) and found to be structurally and functionally
similar to the immunity region of lambdoid phages.
immB contains three genes (figure 28-3): gene 38 (cB), gene
39 (cro), and gene 40 (Q). Gene 38 (cB) encodes a repressor
protein that is homologous to the product of the phage λ cI
gene. Clear-plaque mutants mapping at immB were found
in the cB gene, supporting its role as a primary repressor.
Gene 39 (cro) shows weak homology to cro genes of phages
P22 and HK022 and occupies a position analogous to, and
has a size similar to, that of lambdoid cro genes. The third
gene, 40 (Q), encodes a protein
that is similar to the transcription antitermination factor Q
of phage φ82.

The cB gene is flanked by a complex array of divergent 
operator-promoter sites (figure 28-3). Lobocka et al. (21) 
identified operator sites and showed binding of the N15 CB 
protein to these sites in vitro, but positions of associated
promoters are known exclusively from sequence analysis
and require experimental verification. The two operators
leftward of the cB gene overlap the predicted promoter of
the N15 repA gene, implying that binding of the CB protein
at these operators represses transcription of repA. This
supposition is further supported by an observation that
phage N15-based miniplasmids lacking the cB gene have
a higher copy number than similar plasmids with an intact
cB gene (29). The three operators rightward from cB overlap
the predicted promoter of the cB gene as well as the predicted
promoters of the “late” operon containing cro and Q.
It was proposed (21) that the CB protein, by binding to these 
operators, represses both its own transcription and the 
transcription of cro and Q. 

In addition to CB, two other factors were suggested to
regulate the expression of repA (21), although these hypotheses
have not been verified experimentally. The leader
region of repA contains a sequence typical of strong rho-independent
terminators, suggesting the involvement of
termination and antitermination in the regulation of repA.
Also in the leader region there is a putative counter-oriented 
promoter, P ilIC , that is followed within approximately 80 bp 
by a strong terminator. This promoter could initiate transcription
of a short RNA that is antisense to the leader
sequence of repA, and this short RNA may thereby modulate
transcription of repA. Modulation of a replication gene's
expression by antisense RNA is a common strategy employed
by other plasmids and, particularly, by bacteriophage P1,
where it regulates transcription of the lytic replicon (15)
(chapter 24).


Secondary Immunity Region (immA) 

Ravin et al. (32) have characterized the immA region
(figure 28-4) and found that three open reading frames
encode an inhibitor of cell division (coded by gene icd), an
antirepressor protein (coded by gene antA), and a protein that
may play an ancillary role in antirepression (coded by gene antB). These
genes may be transcribed from two promoters. The upstream
promoter, Pa, could be repressed by the CB repressor,
whereas the weaker downstream promoter, Pb, is constitutive.
Full repression of the antirepressor operon is achieved
by premature transcription termination at T1 and T2 that is
elicited by a small RNA (CA RNA) produced by processing of
the leader transcript of the operon; this mechanism is
similar to the one used in the anti-immunity system of phage
P1 and the lysogeny control region of phage P4 (6, 7) (chapters
24 and 26). The CA RNA thus acts as a secondary repressor,
and clear-plaque mutants mapped at immA were found
within the cA sequence. The CA RNA is a small, stable RNA
molecule with a peculiar secondary structure: a double-stranded
stalk, two stem-loops, and an 8-nucleotide single-stranded
bulge (figure 28-4). The CA RNA appears to act
as a pseudo-antisense RNA, since complementarity between
specific sequences in CA RNA (specifically in the main
loop and bulge) and corresponding sites on the untranslated
leader transcript is required for efficient transcription
termination.

Figure 28-4 Organization of the immA region. A: Map and transcription map of the immA locus (32). Coordinates on the kb
indicated by arrows. The CA RNA is represented by a closed bar. B: Predicted secondary structure of N15 CA RNA.

The antirepressor functions encoded in the immA 
secondary immunity region determine the lysis-lysogeny 
decision of phage N15. Analysis of transcription patterns of
the immA locus shows that the structural genes (icd, antA,
and antB) of the ant operon can only be expressed very 
soon after infection from the two promoters, before the CA 
RNA is produced by processing of the leader region of the 
transcript. The ant operon is then rapidly turned off once 
CA RNA is produced. In the lysogen, the CB repressor turns 
off the promoter Pa, while the second promoter, Pb, allows 
production of the CA RNA immunity factor. Synthesis of 
the CA RNA itself is negatively autoregulated (32) and the 
existence of two differently regulated promoters may be 
particularly important for phage N15 lytic development 
upon prophage induction. A transient inactivation of the CB
repressor may allow transcription from the stronger promoter,
Pa, and the existing CA RNA may not be sufficient to
prevent transient transcription of the ant operon. This could 
result in the synthesis of antirepressor and a switch to the 
lytic pathway. 

The first gene of the ant operon, icd, is not directly
involved in antirepressor function. Expression of cloned icd
instead leads to an immediate arrest of cell division, filamentation,
and finally cell death (32). A possible role for this
gene in the phage N15 life cycle could be to delay cell division
soon after infection. On the one hand, this may produce
a “larger cell” that may be more advantageous for lytic 
development. On the other hand, in the case of the lysogenic 
pathway, it may provide sufficient time for expression of 
the N15 DNA replication and partitioning system so that 
each daughter cell will inherit the plasmid prophage once 
cell division resumes. 


Segregational Stability of the Plasmid Prophage

The N15 plasmid prophage is maintained at three to five
copies per bacterial chromosome and is very stable: its
rate of spontaneous loss is less than 10⁻⁴ per generation
(44). This is much less than would be expected in the case of
random distribution of plasmid copies between bacterial
daughter cells, and it implies the existence of special stabilization
machinery. Two principal mechanisms ensuring
stable inheritance of bacterial plasmids have been described:
active partition of plasmid copies to daughter cells prior to
division (for review see 18) and post-segregational killing
of plasmid-free cells (reviewed in 12). It is unlikely that
the second one is employed by N15, since prophage-free cells
are easily accumulated at the nonpermissive temperature
in N15 lysogens carrying “early” temperature-sensitive
mutations (33).
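
The expectation for random distribution can be made concrete with a small Python calculation. It is a sketch under a simplifying assumption that is not stated in the text: each of the n plasmid copies present at division goes to either daughter cell independently with probability 1/2, so a given daughter receives no copy with probability (1/2)^n. The copy numbers tried below are illustrative values, not measurements.

```python
def plasmid_free_fraction(copies_at_division: int) -> float:
    """Fraction of daughter cells that receive no plasmid when each copy
    goes to either daughter independently with probability 1/2."""
    return 0.5 ** copies_at_division


OBSERVED_LOSS_RATE = 1e-4  # upper bound on loss per generation quoted above (44)

for n in (3, 5, 6, 10):  # illustrative copy numbers at division
    expected = plasmid_free_fraction(n)
    ratio = expected / OBSERVED_LOSS_RATE
    print(f"n={n:2d}: expected loss {expected:.5f} "
          f"({ratio:.0f}-fold above the observed bound)")
```

Under this model, random segregation alone would need at least 14 copies at division to push the loss rate below 10⁻⁴, far more than the three to five copies per chromosome reported for N15.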

Inspection of the complete nucleotide sequence of N15 
revealed a region near the right end of the prophage 
with remarkable similarity to the sop locus, which governs 
partition of F plasmid copies to daughter cells. Ravin and 
Lane (28) demonstrated that the sop locus of N15 in fact
determines stability of the prophage since Sop proteins,
encoded at this locus, can stabilize the partition-defective
N15 derivatives. The structural and functional organization
of the N15 sop locus is similar to that of other partition loci
including F sop and P1 par. These loci consist of a two-gene
operon and an adjacent cis-acting site (for review see 18).
The first gene (gene 28 = sopA) encodes a protein which
binds to the promoter of the partition operon to repress 
transcription (the operon is thus negatively autoregulated) 
and which also acts directly in the partition process itself. 
The product of the second gene (gene 27 = sopB) binds to 
the cis-acting centromere site (C) to form a partition complex 
and acts as a corepressor of operon expression. The phage 
N15 and F-plasmid partition functions appear to be partly 
interchangeable: N15 SopA and SopB proteins can partly 
stabilize partition-defective mini-F plasmids and repress 
the F sop promoter, and vice versa. This work (28) revealed 
that the phage N15 partition system, although a functional
analog of the F sop system, differs from it in several
important respects: 

1. The centromere site of phage N15 is not composed of
a cluster of multiple inverted repeats adjacent to the
sop operon, as is the case with F plasmids, the plasmid-forming
phage P1, and other circular plasmids
(except RK2). Instead, the centromere is represented
by four inverted repeats located in different regions of
the N15 genome. Each of these sites binds the SopB
protein and acts as a centromere (14, 28).

2. Transcription of the F-plasmid sop operon is driven 
from one autoregulated promoter while transcription 
of phage N15 sop is driven by two major promoters 
(11). The first promoter is similar in sequence and 
function to the F sop promoter, and is repressed by 
Sop proteins. The second and stronger promoter is 
insensitive to regulation by Sop proteins but is tightly 
repressed by protelomerase, the N15 enzyme that 
completes prophage replication by generating hairpin 
telomeres. These promoters establish a regulatory link 
between the partition system and other processes of 
N15 maintenance. 

3. The centromere sites are located in regions of the 
N15 genome that are thought to be essential for replication 
and control of gene expression. One site, IR1, is 
located within the coding sequence of the replication 
gene, repA; the second, IR2, is located within gene Q, 
and the other two centromere sites, IR3 and IR4, are 
located close to the late promoters. This suggests that 
the N15 partition functions may be involved in the 
regulation of gene expression and replication, and 
this is further supported by an observation that the 
increased level of expression of the sop genes influences 
the copy number of an N15-based linear 
plasmid (N. Ravin and D. Lane, unpublished data). 

In two other systems the involvement of partition 
genes in the regulation of gene expression seems 
possible: ParB of phage P1 binds to the centromere 
site and is able to silence the nearby regions even if 
present at only physiological levels (40), and KorB of 
circular plasmid RK2 binds with different affinities to 
12 sites within the RK2 plasmid (20). 

These properties imply that the phage N15 sop system is 
not an independent functional unit but, like chromosomal 
partition functions, is part of a complex system involving 
replication and regulation of gene expression. Thus, study of 
the N15 partition system may provide key insights into the 
less well understood processes of chromosome partition. 


Mechanism of Replication and Telomere 
Generation 

The N15 Protelomerase 

The N15 protelomerase was first hypothesized by Valentin 
Rybchin as an enzyme responsible for the formation of 
a linear-hairpin prophage molecule from the circularized 
phage DNA (figure 28-1). In this model the phage N15 
protelomerase is a functional analog of lambdoid phage 
integrases. Sequencing of the N15 genome revealed an open 
reading frame whose translational product, gp29, has some 
sequence homology with the lambdoid phage integrases 
and the product of the BBB03 gene of Borrelia burgdorferi. 
This led to the suggestion that gene 29 (= telN) encodes 
the putative N15 protelomerase (see GenBank AF064539 
annotation; 37, 41). 

Deneke and coworkers cloned the N15 gene 29, purified 
the corresponding protein, TelN, and demonstrated that it is 
responsible for processing the telRL site in vitro (9, 10). The 
enzyme cuts the target sequence, telRL, and joins the 
phosphodiester bonds, making hairpin ends in the absence of any 
further N15 or host-encoded proteins. The target 56 bp telRL 
site consists of the central 22 bp palindrome, telO, and two 
14 bp flanking inverted repeats. The telO region has been 
predicted to provide B-Z DNA junctions, which might 
facilitate processing by TelN (41). DNase I footprinting of 
TelN-telRL complexes showed that an approximately 50 bp 
DNA segment is protected by TelN (10). Surface plasmon 
resonance studies demonstrated that two TelN molecules 
bind to telRL, suggesting that their concerted action is the 
basis for telomere resolution (10). Both linear and circular 
supercoiled DNA function as substrates for TelN, indicating 
that negative supercoiling is not required for the reaction. 
These results clearly show that the protelomerase can 
generate the linear prophage molecule after circularization 
of the infecting phage genome (figure 28-1). 
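
The palindromic character of a site such as telO can be illustrated with a short sketch. A DNA palindrome is a sequence identical to its own reverse complement, so each strand is self-complementary and can fold back on itself, which is what makes a hairpin end possible after cutting and rejoining. The sequence below is a made-up 22 bp example built for illustration only; it is not the real telO sequence, and the function names are assumptions of this sketch.

```python
# Illustrative only: the half-site below is arbitrary, NOT the real telO sequence.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def is_palindrome(seq: str) -> bool:
    """True if the sequence reads the same 5'->3' on both strands."""
    return seq.upper() == reverse_complement(seq)


# Build a hypothetical 22 bp palindrome from an arbitrary 11 nt half-site.
half_site = "GATTACAGGCT"
hypothetical_site = half_site + reverse_complement(half_site)

if __name__ == "__main__":
    print(hypothetical_site, len(hypothetical_site), is_palindrome(hypothetical_site))
```

Any even-length sequence built as a half-site followed by its reverse complement passes this check, whereas an arbitrary sequence such as GATTACA does not.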

Ravin et al. (31) investigated the role of protelomerase in 
N15 prophage replication. They analyzed the protelomerase 
activity in vivo and demonstrated that this enzyme is 
required for replication of linear (but not circular) N15-based 
plasmids. Protelomerase was found to be an end-resolving 
enzyme responsible for processing of replicative 
intermediates. The authors demonstrated that the telN gene and 
telRL site constitute an independent functional unit acting 
on non-N15 replicons. Cloning of this module in circular 
mini-F and mini-P1 plasmids resulted in their linearization 
and further maintenance as linear plasmids with hairpin 
telomeres. 

N15-based linear miniplasmids have been used as cloning 
vectors (29, 48) which, presumably due to the absence of 
supercoiling, appeared to be particularly suitable for cloning 
DNA sequences with inverted repeats (30). The functional 
independence of the protelomerase unit opens the prospect 
of constructing the linear derivatives of commonly used 
plasmid vectors. 

Replication Mechanism 

Little is known about the mechanism of replication of 
linear, covalently closed DNA in any biological system. It 
has been suggested (2) that the hairpin telomere is a potential 
solution to the problem that DNA polymerases alone 
cannot replicate the extreme ends of linear DNA molecules 
(50). Various models involving processing of replicative 
intermediates by an end-resolving enzyme have been proposed 
(figure 28-5; for a review see 3). These models can 
be distinguished by several principal alternatives: 
(i) location of the replication initiation site: internal versus 
telomere-proximal; (ii) direction of replication: uni- versus 
bi-directional; (iii) mode of replication: θ mechanism 
versus something else; and (iv) structure of a replicative 
intermediate processed by an end-resolving enzyme: circular 
dimer versus circular monomer versus linear molecule. 

In order to identify the minimal set of genes able to drive 
replication of the N15 prophage, a set of miniplasmids 
consisting of different fragments of N15 DNA and an antibiotic 
resistance gene has been constructed (27). The shortest 
circular miniplasmid contained only gene 37 (repA), which 
is thus necessary and sufficient to drive replication of a circular 
miniplasmid. The replication initiation site (ori) is located 
within the repA gene (27). The shortest constructed linear 
plasmid consists of repA and a protelomerase module (telN 
gene and telRL site). This is in agreement with the data that 
all replication mutations isolated so far have been mapped 
within repA (41, 42). 

repA is a large gene, with a predicted protein product of 
1324 amino acids. Sequence analysis of the putative RepA 
protein revealed motifs characteristic of bacterial and viral 
DNA primases and helicases, particularly of the phage P4 α 
replication protein (37) (for review of the phage P4 α protein, 
see chapter 26). These observations suggest that RepA 
may be a multi-domain protein similar to the replication 
proteins of bacterial plasmids replicating by the θ mechanism. 
Further arguing for the existence of primase and 
helicase activities of the RepA protein, phage N15 is able to 
productively infect E. coli strains carrying temperature-sensitive 
mutations in primase (dnaG3) and helicase 
(dnaB70) genes at nonpermissive temperature (N. Ravin 
et al., manuscript in preparation). In addition, phage N15 
replication was found to be independent of E. coli polA, recA 
(41), dnaJ, dnaK, and grpE (46) genes. 

To determine the direction of plasmid replication we 
have integrated the phage N15 replication region, including 
repA, its promoter, and cB repressor gene, into the E. coli 
chromosome at the λ integration site. Direction of replication 
was determined by the analysis of which of the two nearby 
bacterial DNA markers flanking this site were amplified 
when the replicon was activated. We found (27) that both 
DNA markers are coordinately amplified, a result consistent 
with replication proceeding in a bidirectional fashion. 

The next principal point allowing discrimination 
between different models of replication is the structure of 
the replicative intermediate processed by the end-resolving 
enzyme. Ravin and coworkers (31) constructed an N15 
mutant carrying a deletion in the protelomerase gene and 
then cloned the telN gene into an expression vector under 
the control of a regulable promoter. The mutant may be 
maintained as a linear plasmid if the telN gene is expressed 
from the vector plasmid present in the same cell; repression 
of the telN gene results in accumulation of unprocessed 
replicative intermediates which were found to be circular 
head-to-head dimer molecules (31). 

Electron microscopic analysis of intermediates generated 
in the course of replication of an N15-based linear plasmid 
allowed identification of three types of replicating molecules 
(27). Type 1 molecules, which are linear and the length of 
the linear plasmid, contain an internal “bubble” located 
near the position predicted for the ori site. These molecules 
likely represent an early step of internally initiated bidirectional 
replication. Type 2 molecules are circles located at 
the end of a linear DNA: these could result from replication 
to one end of the molecule without protelomerase resolution 
of the ends (most likely the left end, which is closer to the ori 
site than the right end). Type 3 molecules are Y-shaped 
molecules with two equal-length arms whose lengths are 
consistent with a single fork on the linear plasmid; these 
could result from TelN cleavage of the circle in a type 2 molecule. 
No circular molecules, either dimers or monomers, 
were found. 

All these data suggest the following model of N15 plasmid 
replication (figure 28-5; reference 25), which is based on 
Bateman's model of replication of palindromic telomeres (2), 
and is largely in agreement with the one proposed previously 
for the N15 prophage (41). Replication is initiated from an 
internal ori site, located within repA, follows the θ mode of 
DNA replication, and proceeds bidirectionally. After duplication 
of telL, protelomerase cuts this site creating hairpin 
ends, and thus a Y-shaped structure is formed. After the 
replication of the right telomere and subsequent cutting, 
two linear molecules are produced (figure 28-5, pathway A). 
Alternatively, under particular conditions, full replication 
of the molecule with the formation of a full head-to-head 
circular dimer may precede end resolution (figure 28-5, 
pathway B). 

The above results suggest that N15 prophage replication 
could serve as a model for replication of other bacterial 
replicons with hairpin ends, such as the linear plasmids 
and chromosomes of Borrelia burgdorferi. Replication of these 
plasmids is initiated at an internal ori site and proceeds 
bidirectionally (24); sequence analysis has revealed regions 
of homology shared by the TelN protein and the Borrelia 
gene BBB03 product (37, 41). Chaconas et al. (5) showed 
that a synthetic sequence having the predicted structure of 
a replicated telomere functions as a substrate for telomere 
resolution in vivo and is sufficient to convert a circular replicon 
into a linear form. The authors proposed that the final 
step in the replication of Borrelia plasmids and chromosomes 
is a site-specific telomere breakage and reunion that 
occurs on the circular-dimer substrate. Later, Kobryn and 
Chaconas (19) showed that the BBB03 gene in fact encodes 
the telomere resolvase and demonstrated its activity in vitro. 
Thus, it seems likely that replication of Borrelia plasmids 
follows the same mode as that of phage N15. Genes highly 
similar to telN were also found in linear phage-plasmids 
PY54 (16) and phiKO2 (4). The cleavage-joining activities 
of PY54 protelomerase have also been shown in vitro (16). 

Interestingly, the N15 model of replication differs from 
models suggested for eukaryotic replicons with hairpin ends. 
Particularly, poxvirus replication (8, 47) is initiated in the 
telomeric region, resulting in the formation of head-to-head 
and tail-to-tail concatemers through strand-displacement 
replication; the duplicated telomeres in concatemers are 
subsequently resolved by an as yet unknown enzyme to 
generate linear monomeric molecules with hairpin ends. 
These observations suggest the possibility of evolutionarily 
independent appearances of prokaryotic and eukaryotic 
replicons with covalently closed telomeres, rather than 
transfer of protelomerase genes from eukaryotes 
(poxviruses) to prokaryotes (Borrelia) or vice versa, as has 
been previously suggested (17). 

Conclusions 

Phage N15 is unique among bacteriophages in its genetic 
organization. Its morphogenetic genes and prophage repressor 
gene are similar to those of the lambdoid phage family. 
At the same time, N15 early operons contain mostly genes 
which reflect the prophage's unique linear-plasmid lifestyle: 
the protelomerase gene, the antirepression operon, 
genes for replication and partition functions, and many 
others with as yet unknown functions. The specific mechanism 
of N15 prophage replication seems to be a combination 
of a typical bidirectional θ-type strategy and the action of a 
specific end-resolving enzyme, protelomerase. This mechanism 
could serve as a model for other replicons of this type. 
Phage N15 must have arisen either through a lambdoid 
progenitor’s accumulation of new genetic modules from 
plasmid and bacterial sources or from an unknown plasmid 
that acquired a lambdoid set of “virion” genes. Therefore, 
N15 provides a very interesting model system for the study 
of phage and plasmid evolution as well as for the study of 
interactions between phages, plasmids and bacterial hosts. 
Bacteriophage P22 

PETER E. PREVELIGE, Jr. 


Bacteriophage P22, a relative of bacteriophage λ, is a 
temperate phage of Salmonella typhimurium and has 
played key roles in the development of molecular biology. 
Originally isolated as a lysogen of Salmonella, it was with 
P22 that Zinder and Lederberg discovered the phenomenon 
of generalized transduction (113). In subsequent years, P22 
has proven to be an important tool in Salmonella genetics, 
and equally importantly has served to illustrate key components 
of genetic and morphogenetic regulation (45). P22 
shares many similarities in genetic structure and regulation 
with phage λ and there are excellent reviews of P22 biology 
(63, 88). The genome of P22 has been sequenced (11, 24, 61, 
72, 100) and 65 genes have been annotated. The sequencing 
results support the hypothesis that phage P22 is a mosaic 
that has evolved through extensive recombination with 
other viruses. Significant progress has been made over the 
past 15 years in understanding the structural biology of 
P22 infection and replication which, after a brief overview, 
will be the focus of this chapter. 


Life Cycle 

Phage P22 enters the cell via the initial binding of the gp9 
tailspike trimer to the O-antigen of the host lipopolysaccharide 
(2, 39, 40). Initial binding is followed by binding to a 
second receptor, and the subsequent ejection of the “E” or 
ejection proteins whose activity is required for active DNA 
to enter the cell (36, 38). Following infection, phage P22 can 
enter either a lytic or a lysogenic growth pathway. In the lytic 
pathway, viral replication proceeds immediately following 
infection and culminates approximately 1 hour later in the 
release of 300-500 phage progeny through cell lysis. In the 
lysogenic pathway, the phage chromosome integrates into 
the host chromosome and is passed to daughter cells during 
host cell division. The primary factor controlling this decision 
is the multiplicity of infection (moi). High moi favors 
lysogeny whereas low moi favors the lytic pathway (50). As 
will be discussed below, maintenance of the lysogenic state 
is an active process involving multiple repressor systems. 
Upon receipt of an appropriate trigger the prophage is 
excised and enters the lytic pathway. 

Lytic Growth 

In the case of the lytic cycle, the first genes to be expressed 
following DNA entry are the “immediate early genes” which 
lie adjacent to the c2 repressor gene and are transcribed by 
the host polymerase from the PR, PL, and Pant promoters 
(figure 29-1). These early genes code for functions involved 
in DNA replication, recombination, and regulation of gene 
expression. There are also two regulatory genes, genes 23 
and 24, both of which function as anti-terminators. Gene 
24, whose expression is driven off the PL promoter, acts 
as an anti-terminator similar in function to the λ phage N 
gene (35, 52, 62). In the absence of gene 24 function, the transcripts 
initiated from promoters PL and PR terminate before 
the early genes are transcribed. When gene 24 is expressed, 
transcription proceeds efficiently. Similarly, the product 
of gene 23, driven off the PR promoter, is also an 
anti-terminator which anti-terminates transcription originating 
from the promoter, Plate (7, 52). The result of gene product 23 
anti-termination is a 20,000 base transcript that encodes the 
genes required for phage assembly and release. The phage 
DNA itself is linear and is circularized by recombination 
and subsequently replicated by a rolling-circle mechanism. 
The resulting concatemer is packaged into the head by a 
headful mechanism which initiates at a specific site termed 
the “pac” site (14, 47, 74) (see chapter 6 for a general overview 
of DNA packaging by double-stranded DNA phage). Approximately 
43,500 bp of DNA are packaged into the capsid 
(12). Since the genome is 41,724 bp in length, there 
is a terminal redundancy of approximately 4% in the packaged 
DNA. Packaging is processive with subsequent rounds 
beginning where the previous round terminated. The 
products of genes 2 and 3 are required for DNA packaging 
and act as a complex. Genetic evidence suggests that 
it is gp3 that initially recognizes the pac site (16, 107). 
Figure 29-1 The genetic map of bacteriophage P22. The genes are positioned above the line (not to scale), with their 
function indicated above. Promoter regions, and the loci of repressor action, are indicated below the line. The c2 
repressor protein inhibits synthesis of the early genes, whose transcription is driven from the two promoters, PL and PR. 
Expression of the anti-terminators, 24 and 23, results in expression of the early and late genes, respectively. Notice that genes 
with related functions are clustered. 


Generalized transduction is a result of packaging initiating 
at a pac-like site in the host cell chromosome with the resulting 
generation of a series of transducing particles (75, 76). 
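
The headful arithmetic quoted above (about 43,500 bp packaged from a 41,724 bp genome) can be checked with a short sketch. It also shows why processive packaging gives each successive particle a different set of chromosome ends. The constant headful size, the pac site placed at coordinate 0, and the function name are simplifying assumptions of this sketch, not details taken from the text.

```python
# A minimal sketch of headful packaging arithmetic, using the two sizes
# quoted in the text. Assumes every headful is exactly the same length
# and that the pac site sits at genome coordinate 0 (both idealizations).
GENOME_BP = 41_724
HEADFUL_BP = 43_500

redundancy_bp = HEADFUL_BP - GENOME_BP
redundancy_fraction = redundancy_bp / GENOME_BP


def headful_starts(n_rounds: int, pac_position: int = 0) -> list[int]:
    """Genome coordinate at which each successive headful begins.

    Each round starts where the previous one ended on the concatemer,
    so the start point drifts by the terminal redundancy each round.
    """
    return [(pac_position + i * HEADFUL_BP) % GENOME_BP for i in range(n_rounds)]


if __name__ == "__main__":
    print(f"terminal redundancy: {redundancy_bp} bp "
          f"({redundancy_fraction:.1%} of the genome)")
    print("start coordinates of the first three headfuls:", headful_starts(3))
```

Under these assumptions the redundancy is 1,776 bp, about 4.3% of the genome, which agrees with the approximate 4% figure given above.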

Lysogeny 

In the lysogenic state, the P22 chromosome integrates into 
the host cell chromosome at a specific site, termed the attB 
site (49, 79). Integration is facilitated through the action of 
the protein product of the P22 int gene (77) with the assistance 
of a host-encoded integration host factor (IHF) whose 
function does not appear to be essential (18). In the lysogenic 
state, relatively few P22 proteins are produced. The genes 
responsible for phage growth, and ultimately for host cell 
death, are repressed through the action of two repressor 
proteins, c2 and mnt. The c2 protein acts at the immunity 
region (immC) in a manner analogous to that of λ repressor 
(1, 20, 35). That is, it binds to OL and OR to prevent transcription 
of early genes. The unique immI region includes 
an antirepressor, ant, and three repressors: mnt, arc, and sar 
RNA. Mnt acts to repress the transcription of the Ant antirepressor 
(73, 101, 103). If mnt is turned off, Ant is synthesized. 
Ant binds to and inactivates the c2 repressor protein with 
the result that the prophage enters the lytic pathway (8, 51, 
87). The ant gene is transcribed from the strong Pant promoter. 
The production of Ant itself is regulated through 
the action of the arc gene whose protein product binds 
within Pant, thereby repressing Ant synthesis. In arc 
mutants high levels of Ant are produced, and these levels 
are sufficiently high to interfere with the synthesis of other 
essential phage proteins (86, 110). 

The ant gene lies downstream of Plate and is additionally 
transcribed during the lytic cycle as part of the late operon. 
Despite transcription from the Plate promoter, the Ant 
protein is not subsequently synthesized. An additional promoter, 
Psar, lies within the ant gene and directs the synthesis 
of a 69 nucleotide antisense RNA that binds the ant messenger 
RNA and inhibits translation (53, 108). 

Both Mnt and Arc have been studied extensively as model 
DNA binding proteins. Mnt is functional as a tetramer and 
binds a 17 bp operator (101, 105). Mnt is comprised of two 
structural domains: a dimeric N-terminal domain, and a 
tetrameric C-terminal domain. The dimeric N-terminal 
domain is responsible for operator binding and specificity. 
Both Arc and Mnt are members of the ribbon-helix-helix 
family of transcription factors in which antiparallel β-sheet 
motifs, rather than helix-turn-helix motifs, are used to bind 
the operator DNA. In Mnt, the C-terminal tetramerization 
domain is comprised of two antiparallel α-helices with 
asymmetrical helical packing and a unique right-handed 
twist (57). Arc functions as a dimer and two dimers bind 
a 21 bp operator site in a highly cooperative reaction (9, 
102). Detailed mutagenesis and folding studies on Arc have 
resulted, among other things, in a deeper understanding 
of the role of the hydrophobic core in protein stability (54), 
the importance of cooperative interactions in DNA binding 
(78), and the impact of effective concentration on protein 
stability (70). 
Structure and Assembly 

Significant advances in characterizing the structure and 
assembly of bacteriophage P22 have been made over the 
past 15 years. The assembly of infectious virions consists 
of two independent, linear pathways: the assembly of the 
capsid and the assembly of the tail. These two subassemblies 
then associate to form the infectious virion. Interest in the 
assembly of the capsid has stemmed from the observation 
that a “scaffolding” protein is required for proper form determination 
(15, 44). In the absence of scaffolding protein the 
coat protein subunits, which comprise the capsid, assemble 
into aberrant forms which are largely unclosed (22). The 
use of a scaffolding protein to guide assembly is not unique 
to bacteriophage P22. However, unlike the scaffolding proteins 
of other phages, the scaffolding protein of phage P22 
is not proteolytically cleaved during morphogenesis, a fact 
that significantly simplifies biochemical studies of assembly 
and maturation. 

In the case of the tailspike protein, the properly folded 
and assembled form is SDS-resistant whereas intermediates 
in folding and assembly are SDS-sensitive (28, 29). This 
fact has made possible quantitative investigations of protein 
folding in vivo using simple SDS-PAGE analysis as an 
analytical technique. Investigations of tailspike folding and 
scaffolding-protein-mediated P22 head assembly have 
provided mechanistic insight into protein folding and the 
control of biological self-assembly. 

Assembly of the Capsid 

As is typical of double-stranded DNA phages, the P22 
capsid assembles through the formation of a procapsid 
intermediate (43, 46). Approximately 300 molecules of the 
33 kDa scaffolding protein co-assemble with 420 molecules 
of the 47 kDa coat protein to form a T = 7 procapsid 
(figure 29-2). During assembly one vertex is differentiated 
from the other 11 by the presence of a dodecameric complex 
of the 88 kDa portal protein (3, 4). This portal vertex serves 
as the conduit for DNA entry during packaging and egress 
during infection. In addition to the portal protein, approximately 
12 copies each of three minor E or ejection proteins 
(gp7, gp16, and gp20) are incorporated into the procapsid. 
Although both their mechanism of action and location 
within the procapsid are unknown, these proteins are 
required for the delivery of an infectious viral genome to 
the host cell, and transit from the phage to the host cell 
during infection (36). 
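
The figure of 420 coat protein molecules follows from the Caspar and Klug description of icosahedral lattices, in which a capsid with triangulation number T = h² + hk + k² contains 60T subunits arranged as 12 pentamers and 10(T - 1) hexamers. The sketch below works through that arithmetic for an idealized lattice; it ignores the replacement of one pentameric vertex by the portal complex, and the function names are assumptions of this sketch.

```python
# A minimal sketch of Caspar-Klug lattice arithmetic for an idealized
# icosahedral shell (the portal vertex is ignored here).

def triangulation_number(h: int, k: int) -> int:
    """T = h^2 + h*k + k^2 for non-negative lattice indices h and k."""
    return h * h + h * k + k * k


def coat_subunits(t: int) -> int:
    """Total subunits in an idealized shell of triangulation number T."""
    return 60 * t


def capsomer_counts(t: int) -> tuple[int, int]:
    """Return (pentamers, hexamers) for an idealized shell."""
    return 12, 10 * (t - 1)


if __name__ == "__main__":
    t = triangulation_number(2, 1)           # the T = 7 lattice
    pentamers, hexamers = capsomer_counts(t)
    print(t, coat_subunits(t), pentamers, hexamers)
    # Cross-check: subunits counted capsomer by capsomer
    print(5 * pentamers + 6 * hexamers)
```

For T = 7 this gives 12 pentamers and 60 hexamers, and 5 × 12 + 6 × 60 = 420 subunits, matching the approximate copy number quoted for the coat protein.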

The 42 kb viral DNA is replicated as a concatemer by 
the rolling-circle mechanism and packaged in a headful 
fashion. Concomitant with DNA packaging the scaffolding 
protein exits the capsid and is capable of sequentially 
catalyzing approximately five rounds of assembly. DNA packaging 
induces a conformational change within the lattice 
resulting in head expansion and the appearance of angularity. 
The portal vertex is closed by the sequential addition 
of the products of genes 4, 10, and 26 and the stable capsid 
is now ready for tailspike addition (83). 

Structure of the Capsid 

The overall dimensions of the P22 procapsid and phage were 
determined by negative-stain electron microscopy as well as 
small-angle X-ray scattering by King and colleagues in the 
late 1970s (10, 21, 22). The P22 capsid lattice undergoes an 
expansion of approximately 10% from 580 Å to 610 Å 
Figure 29-2 Morphogenetic pathway of P22. Approximately 420 molecules of the coat protein (gp5) co-assemble with 
approximately 300 molecules of the scaffolding protein (gp8). The minor (E) proteins, gp7, gp16, and gp20, as well as 
the portal protein (gp1), are incorporated during the early stages of assembly. The concatemeric double-stranded DNA is 
delivered to the portal vertex by the action of a complex composed of gp2 and gp3 and packaged by a headful mechanism. 
Scaffolding exit is concomitant with packaging. Once a headful of DNA is packaged (~104% of the genome), the DNA is 
cut and the portal vertex stabilized by the addition of gp4, gp10, and gp26 (see chapter 6 for a general discussion of 
DNA packaging). Up to six tailspike trimers (gp9) are now able to attach, rendering the phage infectious. 
Figure 29-3 Cryo-electron micrographic image reconstructions. A: The P22 prohead. B: The mature virion. Both are hexameric 
and pentameric T = 7 lattices. The procapsid has a diameter of approximately 58 nm and is roughly spherical, while the 
phage has a diameter of 61 nm and displays pronounced angularity. Figure courtesy of Matthew Baker. See 
thebateriophages.org/frames_0290.htm for a color version of this figure. 


during DNA packaging, as is typical for a double-stranded 
DNA phage. Both procapsid and capsid lattices appear to be 
T = 7 subtriangulated icosahedra based on freeze-fracture 
electron microscopy. 

Subsequent years have seen the development of cryo- 
electron microscopy and image reconstruction as a powerful 
tool for the analysis of the structure of viral capsids. Based 
on cryo-electron microscopy image analysis, both procapsids 
and mature virions have been shown to be T = 7 lattices 
of hexamer- and pentamer-clustered coat-protein subunits 
as originally predicted by Caspar and Klug (17, 64). The 
procapsid form is indeed smaller and appears relatively spherical 
whereas the capsid lattice is larger and appears more 
angular (figure 29-3). A number of conformational changes 
within the lattice are responsible for the alterations in 
appearance. The procapsid lattice has holes, approximately 
25 Å in diameter, located in the center of each hexameric 
coat-protein cluster. It has been postulated that these holes 
provide the exit port through which scaffolding leaves 
during DNA packaging (64). Furthermore, the coat-protein 
hexamers display a skewed character in the procapsid (93), 
which disappears leaving instead relatively symmetrical 
hexamers in the mature virion (112). This rearrangement 
appears to be accomplished through an outward movement 
of trimerically clustered subunits at the strict and 
local 3-fold axes (112), coupled with a rotational movement 
within the hexamer (42). 

Cryo-electron microscopy provides the opportunity to 
visualize macromolecular disposition within the capsid as 
well as on the surface. Examination of procapsids revealed 
that the scaffolding was not globally icosahedrally ordered 
(93). However, comparison of procapsids and procapsid-like 
particles lacking scaffolding protein suggests that small 
regions of the scaffolding protein may contact the inner 
surface of the coat protein lattice at four of the six hexameric 
coat subunits (94). The observation that the scaffolding 
protein itself is not icosahedrally ordered within the procapsid, 
yet is responsible for modulating icosahedral assembly 
of the capsid, poses something of a dilemma. Where within 
the scaffolding protein does the information reside that 
determines icosahedral symmetry? 

This problem is somewhat simplified by the observation 
that in the absence of scaffolding protein and at elevated 
temperature, the coat protein can polymerize into T = 4 as 
well as T = 7 procapsids. Cryo-electron microscopy reconstructions 
of these T = 4 particles indicate that the hexameric 
and pentameric clusters of coat protein are nearly 
identical in the two forms (95). This suggests that the role of 
the scaffolding protein is to modulate the relative positioning 
of hexamer- and pentamer-clustered coat protein into the 
appropriate lattice. However, as will be discussed below, in 
the case of phage P22 it does not appear that preformed 
hexamers and pentamers are the building blocks for 
assembly as has been suggested for phage HK97 (109). 

Structure and Folding of the Coat Protein 

While the procapsid, capsid, and coat protein subunits have 
proven refractory to crystallography to date, biochemical 
analysis has yielded insight into the structure of the coat 
protein. Circular dichroism and Raman spectroscopies indicate 
that the coat protein is a mixed α/β protein, composed 
of approximately 20% α-helix and 25% β-sheet (67, 90, 92). 
Proteolytic digestion studies indicate that the coat protein 
is composed of two structural domains of approximately 
equal size (48). The hinge region, which resides in the 



Figure 29-4 The cryo-electron micrographic structure of the 
coat protein. The cryo-electron microscopic electron density 
obtained from image reconstructions of P22 phage is shown 
in the cage structure and the fit of a ribbon diagram from 
the HK97 coat protein crystal structure is shown within the 
electron density map. See thebateriophages.org/frames_0290.htm 
for a color version of this figure. 


residues 180-205, connects the two domains and becomes 
increasingly protected during capsid assembly and maturation 
(48, 96), a property shared with the capsid proteins 
of phages T4 and HK97 (41, 106). While there is no sequence 
homology, based on cryo-electron microscopy reconstructions 
it seems that the fundamental fold of these subunits 
as well as their packing within the capsid lattice is conserved 
between phages HK97 and P22 despite the fact that the 
phage P22 coat protein is substantially larger (429 residues 
versus 282) (figure 29-4) (42). 

In complex macromolecular systems the boundaries 
between protein folding and assembly become blurred and 
consequently the P22 coat protein has been the focus of a 
number of folding studies. Temperature-sensitive mutations 
in coat proteins display a temperature-sensitive folding (tsf) 
phenotype in which they are temperature-labile during fold¬ 
ing but display stability similar to wild-type once folded and 
assembled (30). At elevated temperatures, the tsf mutants 
form low-molecular-weight oligomers (i.e., dimers and
trimers), which are incapable of assembly (89). The observation
that 17 tsf mutations in the coat protein could be
rescued by overexpression of the host chaperone, groEL, suggested
chaperone-assisted folding (31). It proved possible
to co-immunoprecipitate coat protein and groEL from cell
lysates. In vivo experiments suggest that coat proteins partition
between aggregation and assembly, and that the ratio
of the fraction in each form can be altered by the presence 
or absence of groEL. Biophysical studies of the interaction
of wild-type and mutant coat protein with groEL suggest
that the subunits have considerable secondary and tertiary 
structure when bound to groEL. This in turn is suggestive of 
an interaction between groEL and the coat protein at a late 
folding stage. Interestingly, a cluster of amino acids near 
the C-terminus of the N-terminal domain act as intragenic 
global suppressor mutations that alter the balance between 
folding and aggregation. 

The coat protein itself has unique biochemical properties. 
For example, even when properly folded the subunits are 
only marginally stable. Calorimetric studies indicate
that the melting temperature is near 37°C, a temperature
where the phage itself is fully infectious. However, when
assembled into the capsid lattice, the subunits are substantially
stabilized and temperatures in excess of 80°C are
required to denature both procapsid and phage lattices
(26, 27). Thus, the subunits appear to be in a relatively flexible
conformation, and derive the bulk of their stabilization
from inter-lattice contacts. This property may confer the 
flexibility required both to adopt the “quasi-equivalent” 
conformations required to form a T = 7 lattice and to 
undergo the conformational changes required for expansion 
during DNA packaging. Hydrogen/deuterium exchange 
studies performed both with Raman and mass spectrometry 
have identified the regions of the coat protein which become 
stabilized upon assembly and maturation (96, 98). The 
expansion that accompanies packaging can be mimicked 
by gentle heating of the procapsids. While the transforma¬ 
tion displays a high activation energy it interestingly is 
also exothermic, suggesting that the procapsid is a meta¬ 
stable, spring-loaded structure poised to rearrange to a 
lower energy form during DNA packaging (27). 
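
The T = 7 geometry mentioned above fixes the subunit count of the shell. As a rough sketch, the standard Caspar-Klug relations (T = h² + hk + k², 60T subunits, 12 pentamers and 10(T - 1) hexamers) can be checked with a few lines of arithmetic. These relations and the function name are not taken from this chapter; they are included only as an illustration.

```python
def triangulation(h: int, k: int) -> dict:
    """Caspar-Klug bookkeeping for an icosahedral lattice with indices (h, k)."""
    t = h * h + h * k + k * k
    return {
        "T": t,
        "subunits": 60 * t,        # subunits in a complete icosahedral shell
        "pentamers": 12,           # one per icosahedral vertex
        "hexamers": 10 * (t - 1),
    }

p22 = triangulation(2, 1)   # (h, k) = (2, 1) gives T = 7
print(p22)

# Assumption for illustration: if the portal occupies one vertex in place of
# a coat pentamer, the shell holds 60*T - 5 coat subunits.
print(p22["subunits"] - 5)
```

Under these assumptions a complete T = 7 shell has 420 subunit positions, which gives a sense of how many quasi-equivalent environments the flexible coat subunit must accommodate.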

Determining the structure of the packaged DNA within 
P22, or phage in general, has proven a formidable challenge 
and a number of models have been proposed. In the case 
of phage P22, Raman spectroscopic studies have indicated 
that the DNA remains in the B-form, and is uniformly 
curved rather than kinked (68, 92). Cryo-electron micro¬ 
scopy studies have demonstrated that the strands of the 
DNA are close-packed within the phage head with an aver¬ 
age spacing of 2.5 nm. The coaxial spool model, as proposed 
by Earnshaw and Harrison (23), appears to be consistent 
with all extant data (112). 

Structure of the Scaffolding Protein 

The 33 kDa P22 scaffolding protein was predicted to be a
highly α-helical protein, and this prediction has been borne
out by experiment. Both circular dichroism spectroscopy
and Raman spectroscopy suggest the presence of approximately
37% α-helix in the native protein (90, 92, 99).
Hydrodynamic measurements suggest that the protein is 
a long rod-like molecule (25, 60). Both hydrogen/deuterium 
exchange studies, in which no exchange-protected core was 
observed, and calorimetric studies, in which no cooperative 
melting profile was observed, suggest that the molecule
is highly flexible in solution (27, 97, 99). Analytical ultracentrifugation
experiments demonstrated that the scaffolding
protein is in a monomer-dimer-tetramer equilibrium
in solution. A naturally occurring mutation at residue 74
introduces a single cysteine into the protein. This mutant 
spontaneously forms dimers which are biologically active, 
suggesting that the dimeric form is likely to be the assembly- 
active species (60). 

The C-terminal region of the scaffolding protein 
comprises at least part of the region that interacts with the 
coat protein. Deletion of the C-terminal 11 amino acids 
renders the scaffolding protein incapable of binding to coat 
protein, while scaffolding protein in which the N-terminus 
has been deleted up to residue 141 is capable of promoting 
procapsid assembly, albeit with somewhat reduced fidelity 
(58). The nuclear magnetic resonance (NMR) structure of
the C-terminal 35 residues of scaffolding protein revealed 
that the coat binding domain was a helix-loop-helix motif 
in which residues from each of the helical segments pack 
together to form a hydrophobic core (85). The fold of the 
coat binding domain of the scaffolding protein is struc¬ 
turally homologous to the tetratricopeptide repeat motif, 
a motif used to modulate protein-protein interactions. The 
surface of the coat binding domain is strongly positively 
charged, suggesting the likelihood of electrostatic binding. 
In accord with this suggestion, the presence of high salt 
concentrations blocks the binding of the scaffolding protein 
to preformed immature shells of coat protein (59). 

While the C-terminal 35 residues (residues 269-303) are 
structured, the adjacent 31 residues (238-268) are highly 
flexible in solution. This result explains the cryo-electron
microscopy data in which it was determined that the scaffolding
protein is not icosahedrally disposed (93). Presumably
the coat binding domain is anchored at the inner
surface of the procapsid but the remainder of the scaffolding 
protein molecules are free to move around. It is not known 
whether the structure determined for the scaffolding pro¬ 
tein represents the conformation when it is bound, or 
whether there is a conformational change in the scaffolding 
protein during the release step. 

Scaffolding exit during DNA packaging appears to be 
an active process. Two mutations in the scaffolding protein, 
L177I and Q149W, result in failure of the scaffolding protein
to be released and therefore these mutations block 
DNA packaging (32). This is a somewhat curious result 
because truncated versions of scaffolding protein, in which 
this region is missing entirely, are capable of both procapsid 
re-entry and exit in a model system. 

Scaffolding Protein and Assembly 

The development of an in vitro assembly system in which 
purified coat protein and scaffolding protein subunits could 
be induced to assemble into procapsid-like particles proved
a significant advance in understanding the assembly of
phage P22 and double-stranded DNA phage in general (66). 
The in vitro system recapitulates the need for scaffolding 
protein for the high-fidelity assembly seen in vivo. Impor¬ 
tantly, for assembly this system requires only the simple 
mixing of coat and scaffolding protein without the need 
for solvent change such as pH or divalent cation alterations. 
Therefore, the components can be studied independently 
under assembly conditions. 

As described above, the scaffolding protein exists in 
a monomer-dimer-tetramer equilibrium in solution, a
theme reiterated in other scaffolding proteins. The isolation 
of a mutant which spontaneously forms disulfide- 
crosslinked dimers (R74C) allowed for a direct demonstra¬ 
tion of the activity of the dimeric form in vivo. Subsequent 
molecular biology-based dissection experiments indicated 
that the C-terminus of the scaffolding protein is required 
for coat protein binding and hence activity. The NMR- 
determined structure of the coat binding region revealed 
that this C-terminal region consists of a helix-turn-helix 
motif structurally analogous to the tetratricopeptide repeat 
motif (TPR) shown to be important for protein-protein inter¬ 
actions (figure 29-5). The helix-loop-helix is highly basic, 
and a significant component of the interaction between the 
coat and scaffolding proteins is electrostatic in nature. 

One simple model by which scaffolding protein could 
control form determination is the “preformed core” model,
in which scaffolding protein spontaneously associates to 
form a micellar-type structure whose outer surface is 
subsequently tiled by coat protein. Such a model has 
been proposed for T4 assembly based on the observation 
that cells infected with mutants that do not express coat 
protein accumulate scaffolding protein cores. However, 
this mechanism does not appear to be operative in P22. 
Under conditions where the scaffolding protein is active 
for assembly, no preformed cores can be detected in solution 
(60, 66).

A second model is one in which scaffolding protein is 
required only to initiate the assembly process. Subsequent 
growth steps can occur without the need for scaffolding 
protein. This model can be addressed by lowering the 
amount of scaffolding protein available for assembly, and 
determining the amount required for procapsid formation. 
This is a difficult experiment to perform in vivo because 
scaffolding protein appears to regulate its own level of 
synthesis at the post-transcriptional level, presumably by 
interacting with its own messenger RNA. However, this 
experiment can be performed in vitro, with the results 
indicating that a minimum of approximately 120 molecules 
of scaffolding protein are required for the assembly of a 
stable procapsid. Thus, it appears that scaffolding protein 
is required throughout the assembly process (66). 

The kinetics of procapsid assembly have been studied 
in vitro to dissect the assembly pathway. Based on the 
presence of a lag phase, a critical concentration of coat 

Figure 29-5 The solution structure of the C-terminus of the
scaffolding protein as determined by NMR. The region
displayed corresponds to residues 264-303. Note the two
helices packed with a five-residue β-turn. Note also the basic
lysine and arginine residues whose side chains point out into 
solution. See thebateriophages.org/frames_0290.htm for a 
color version of this figure. 

protein for assembly, and the paucity of assembly intermedi¬ 
ates, the overall assembly pathway appears nucleation- 
limited (65). A nucleus size of five coat protein subunits was 
deduced from the concentration dependence of the assembly 
reaction. Thus, it appears that nucleation of assembly occurs 
at a pentamer of coat protein, followed by outward growth. 
The growth appears to occur by the addition of coat protein 
monomers rather than by the addition of preassembled 
substructures such as pentamers, as has been proposed 
for phage HK97. It seems likely that this is a reflection of 
the relative strengths of the intersubunit interactions, 
and not a fundamental difference in the design of assembly 
pathways, which instead would seem likely to be conserved. 
It is not known whether subsequent subunits can add to 
every growing point, or if there is a preferred sequence of 
additions. 
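
One common way to extract a nucleus size from concentration dependence is to assume that the initial assembly rate scales as a power of the coat protein concentration, rate ∝ c^n, so that the slope of log(rate) against log(c) estimates n. The sketch below applies that assumption to synthetic numbers. The data, the rate constant, and the function name are hypothetical and are not taken from reference 65, whose actual analysis may differ.

```python
import math

def loglog_slope(concs, rates):
    """Least-squares slope of log(rate) versus log(concentration)."""
    xs = [math.log(c) for c in concs]
    ys = [math.log(r) for r in rates]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx

# Synthetic data generated from rate = k * c**5 (hypothetical values).
k = 2.0e-3
concs = [0.5, 0.75, 1.0, 1.5, 2.0]      # arbitrary concentration units
rates = [k * c ** 5 for c in concs]

order = loglog_slope(concs, rates)
print(round(order, 3))   # apparent reaction order, read as the nucleus size
```

With real measurements the slope would carry experimental scatter, and the interpretation as a nucleus size holds only insofar as the simple power-law model applies.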

A biologically active procapsid contains not only coat and 
scaffolding protein but also a differentiated portal vertex 
and the ensemble of “E” proteins. The E protein gp16 can be
incorporated into nascent procapsids in vitro if present early
in the assembly reaction. This protein also is incorporated
at the proper stoichiometric amounts even if supplied in
excess. Together these observations suggest that only coat
and scaffolding protein are required for controlled incorporation
of the E protein (91). Indeed, mutants in scaffolding
protein have been isolated which do not incorporate gp16.
In contrast, the gp1 portal protein has not been successfully
incorporated into nascent procapsid in vitro whether
supplied in monomeric or dodecameric form. 

Based upon the kinetic analyses and structural studies, 
a model of how scaffolding protein might control form 
determination has been proposed. In this model, the role of 
the scaffolding protein is to stabilize otherwise unstable 
subunit additions by binding in key locations during 
assembly. The scaffolding protein thus may act to both 
stabilize and “energetically steer” the assembly process 
toward a T = 7 lattice (94).

Incorporation of the Portal 

While there are a number of possible explanations for 
the failure of the portal protein to be incorporated during 
assembly in vitro, a likely one appears to be that an adaptor 
molecule is missing. The portal protein is a dodecameric 
structure that lies at a 5-fold vertex. Thus with P22, as with 
the other double-stranded DNA phage, there is a symmetry 
mismatch at the portal vertex. It has been proposed that the 
symmetry mismatch simplifies rotation because at no point 
during rotary motion would all the intersubunit interactions 
be either in phase or out of phase (34). A number of different 
adaptor molecules have been considered including messen¬ 
ger RNA and GroEL (6). At this point the existence of an 
adaptor molecule remains unproven. 
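
The geometric argument can be illustrated numerically. In this sketch a 12-fold ring rotates inside a 5-fold site, each modeled as equally spaced points on a circle (a deliberate simplification), and the code counts how many of the 12 ring positions coincide with one of the 5 site positions at each rotation angle. The function name and the one-degree scan are illustrative choices, not part of the original analysis (34).

```python
from fractions import Fraction

def aligned_pairs(n_inner: int, n_outer: int, angle: Fraction) -> int:
    """Count inner-ring positions that coincide with an outer-ring position.

    Positions are angles in degrees; the inner ring is rotated by `angle`.
    Exact fractions avoid floating-point comparison problems.
    """
    outer = {Fraction(360 * j, n_outer) for j in range(n_outer)}
    count = 0
    for i in range(n_inner):
        pos = (Fraction(360 * i, n_inner) + angle) % 360
        if pos in outer:
            count += 1
    return count

# Scan a full turn in 1-degree steps for a 12-fold ring in a 5-fold site.
counts = [aligned_pairs(12, 5, Fraction(a)) for a in range(360)]
print(max(counts))                        # most pairs ever in register at once
print(sum(1 for c in counts if c == 1))   # angles at which one pair is in register

# For comparison, a hypothetical 12-fold ring in a 6-fold site.
print(aligned_pairs(12, 6, Fraction(0)))
```

In this toy model the mismatched 12/5 arrangement never has more than one pair in register, whereas matched symmetries bring several pairs into register simultaneously, which is the situation the mismatch is proposed to avoid.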

The portal vertex presents a second level of asymmetry 
defined by the presence of a single portal vertex that is 
differentiated from the other 11 otherwise identical vertices. 
The mechanism by which P22 incorporates a single portal 
complex during assembly also remains obscure. In vivo 
virtually all proheads are capable of packaging DNA and 
converting to phage, implying the presence of a portal 
complex in nearly every procapsid-like particle. Virions with 
two portal complexes are extremely rare. The most straightforward
way to ensure the incorporation of one and only one
portal complex is to couple portal incorporation to nucleation
of assembly. However, mutant phage which do not
express portal protein assemble procapsid-like particles at a 
rate identical to the assembly rate in the presence of portal 
protein (5). Thus, obligate incorporation during nucleation 
does not appear to be a viable model. A second model, in 
which the portal protein adds last, has been suggested for 
phage T7, and was explored by Moore and Prevelige (55). 
Pulse-chase experiments indicate that the portal protein 
cannot be added after prohead-like particles have been
assembled. Thus, it appears that the portal protein is incorporated
during the growth phase of assembly, perhaps with
the help of an adaptor molecule. 

Morphologically, the phage P22 portal vertex is similar to
the portal protein complexes of other double-stranded DNA
bacteriophage: it consists of a dodecamer of the portal subunits
arranged in a ring with a diameter of approximately
18 nm. The ring has a central channel with a diameter of
approximately 3 nm, which is of sufficient width to allow 
the passage of DNA (3). Spectroscopic studies reveal that 
it is a highly α-helical protein, and that it is possible to
align predicted elements of secondary structure in the P22
portal protein with the core region of the known crystal
structure of the phage φ29 portal protein (56, 71). The P22
portal protein has been crystallized and crystallizes as
a dodecamer (19). In contrast to φ29, there is no evidence
for RNA involvement with the P22 portal protein or DNA 
packaging. 

Packaging of the concatemeric double-stranded DNA is 
modulated by the actions of the products of genes 2 and 3, 
which together form a complex that is stabilized by ATP. 
Gp3, which corresponds to the terminase small subunit, is 
an 18.6 kDa protein and is responsible for recognition of
the pac site (12-14, 16). The pac site itself is an asymmetrical
22 bp region that lies within gene 3. Gp3 recognizes the pac 
site and makes a cleavage near it, generating a free end 
of DNA. This free end is then threaded into the procapsid 
through the portal complex until the head is full. At 
that point, the headful nuclease, presumably gp3, cleaves 
the DNA. Subsequent rounds of packaging begin at the 
newly generated end. The pac site corresponding to the 
first cleavage event does not lie at the end of the packaged
DNA but rather approximately in the center of a 120 bp
DNA end region (107). Whether this reflects the assembly
of a complex of several molecules of gp2 and gp3, or slipping
of the gp2/gp3 complex prior to cutting, is unclear. Gp2 is 
a 57.6 kDa protein which corresponds to the terminase 
large subunit and contains three predicted ATP bind¬ 
ing sites. Gp2 is presumably involved in translocation of 
the DNA. 
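
The sequential headful mechanism described above can be sketched as a simple walk along a concatemer. The genome and headful lengths used here are placeholder values chosen only so that a headful slightly exceeds one genome; they are assumptions, not figures from this chapter, and the function name is hypothetical.

```python
def headful_series(genome_len: int, headful_len: int, n_heads: int, pac_pos: int = 0):
    """Return (start, end) genome coordinates for successive headfuls.

    Packaging starts at the pac cut and proceeds in one direction along a
    concatemer; each head takes `headful_len` bp and the next head starts
    where the previous one was cut. Coordinates are reported modulo the
    genome length.
    """
    events = []
    start = pac_pos
    for _ in range(n_heads):
        end = start + headful_len
        events.append((start % genome_len, end % genome_len))
        start = end
    return events

GENOME = 40_000    # bp, placeholder
HEADFUL = 41_600   # bp, placeholder (about 104% of the genome)

for start, end in headful_series(GENOME, HEADFUL, n_heads=4):
    print(start, end)

# Each packaged chromosome repeats HEADFUL - GENOME bp at both ends, and
# successive start points shift by that same amount along the genome.
print(HEADFUL - GENOME)
```

Because only the first cut is made at pac, later chromosomes in the series begin progressively farther from it, which is why the position of the first cleavage matters for interpreting end-region data.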

Structure of the Tailspike 

Phage P22 infection of Salmonella begins with tailspike 
recognition of the O-antigen of the Salmonella lipopolysaccharide
(LPS). The active tailspike is composed of a trimer
of the 72 kDa product of gene 9 and three to six tailspike
trimers bound to the phage capsid are required for infectivity
(37, 38). The structures of a C-terminal fragment of tailspike
(residues 109-666), an N-terminal fragment (residues
1-124), and the tailspike bound to the LPS have been solved
crystallographically (80-82). As suggested by spectroscopic
techniques, the tailspike is a highly β-sheet-rich protein and
can be divided into four distinct domains (figure 29-6). The
N-terminal 100 amino acids of each subunit are folded into
an antiparallel β-sheet. Residues 143-540 form a 13-turn,
right-handed, parallel β-coil. Residues 540-620 form an
interdigitated β-sheet which then separates. The remaining
amino acids are folded into individual C-terminal β-sheets.
The overall appearance of the folded trimer is that of a fish,
with the N-terminus forming the head, the parallel β-coil



Figure 29-6 The crystal structure of the tailspike trimer. The structure of residues 109-666 (the C-terminus) of wild-type P22 
tailspike protein is shown in ribbon-diagram form. The N-terminus of the molecule lies to the left of the figure, and the 
C-terminus lies to the right. The central region (the body of the fish) corresponds to the parallel β-coil structure.

See thebateriophages.org/frames_0290.htm for an in-color version of this figure.

forming the body, and the C-terminal domain forming a
caudal fin. 

The binding site for the LPS is a cleft located in the central
part of the β-coil which accommodates all eight carbohydrate
residues of two repeating O-antigenic units (80).
The tailspike protein displays receptor-destroying activity 
by cleaving the glycosidic bond between rhamnose and 
galactose. However, the enzymatic activity is slow, with 
the tailspike cleaving two bonds per minute at physiological 
temperature. Given the short time required for infec¬ 
tion, it is unlikely that the enzymatic activity acts as a drill 
during infection. Rather, it is more likely that the enzymatic 
activity functions to facilitate the release of newly assem¬ 
bled phage from cell debris following cell lysis. A similar 
role has been proposed for the neuraminidase activity of 
influenza virus. 


Folding of the Tailspike Protein 

The unique biochemical properties of the tailspike protein, 
coupled with the well-defined genetics of the bacteriophage 
system, have made it an ideal subject for in vivo studies of 
protein folding. The tailspike folds in a multistep pathway, in 
which partially folded intermediates associate to form an 
SDS-sensitive protrimer. The protrimer then matures into a 
very stable SDS-insensitive trimer. The reason for this 
stability is that the central region of each subunit of the 
trimer is folded into a right-handed β-helix structure, and
the C-terminal regions interdigitate to form β-sheet structures.
The ability to readily identify and quantify the ratio of
partially folded and fully folded protein by the simple technique
of SDS-PAGE analysis has allowed for the isolation and
characterization of a series of mutants which destabilize 
the partially folded intermediates during folding, the tsf 
mutants. Analysis of these mutants has lent support to the 
idea that protein folding and assembly take advantage of
transient interactions between amino acid residues that do 
not play critical roles in the stability of the final structure. 
At restrictive temperatures the tsf mutant proteins form 
aggregates, indicating that aggregation occurs through the 
interaction of partially folded protein molecules. These 
studies also indicate that alterations in the lifetime of the 
protein folding intermediates can result in aggregation 
(33, 84, 104, 111).

The tailspike protrimer represents a partially folded 
intermediate that can be isolated and studied biochemically. 
The tailspike protein contains eight cysteines per subunit, 
all of which are reduced in the mature folded tailspike 
trimer. Surprisingly, in the protrimer some of the cysteine 
residues are oxidized to form intermolecular disulfide bonds. 
The formation of these bonds in a folding intermediate 
may serve to facilitate subunit interactions or stabilize the 
protrimer during the reorganization required for successful 
folding (69). 


The Future 

The comparative study of the genetics of lambdoid phages
has led to an appreciation of their evolutionary relatedness 
(chapter 27). It appears that this relatedness extends to the 
animal viruses as well. Comparative studies of the assembly 
pathways will serve to determine whether the pathways 
themselves are obligately conserved. For example, it may be 
that there are only certain strategies that will result in the 
formation of a biologically active, topologically closed viral 
capsid. 

The Bacteriophage Mu

LUCIANO PAOLOZZI 
PATRIZIA CHELARDINI 


Mu was the first mobile genetic element identified in
prokaryotes. Since its first isolation, in 1963, it has
attracted the interest of many biologists. This interest is a
consequence of its double nature: Mu is both a bacteriophage
and a transposon.

Studies on Mu began when L. Taylor published a paper on 
a new temperate bacteriophage that showed the unusual 
ability to insert its DNA in multiple sites of the host 
genome. This integration was coupled with the induction of 
mutations in the host cell (236). The insertional mutagenesis 
phenomenon was thus described for the first time and it was 
shown that when sensitive bacteria (Escherichia coli K12) 
were infected with bacteriophage Mu, they exhibited 
increased mutation rates at many genetic loci. The prophage 
was always linked to the chromosomal site of the induced 
mutation. These properties were similar to those of the 
“controlling elements” postulated by Barbara McClintock 
(176), consisting of DNA moving from one position to 
another in the maize chromosome that resulted in modifica¬ 
tion and even suppression of the function of some genes. 

Soon after these interesting discoveries, phage Mu 
revealed other fascinating aspects: (i) an unusual DNA 
structure consisting of heterogeneous host sequences at 
both ends of the Mu chromosome (27, 48, 49, 110), (ii) an 
ability to alternate host range by inversion of a DNA segment 
(127, 247), (iii) a paradigmatic mode of DNA replication 
by transposition, (iv) formation of a variety of host DNA 
rearrangements in the bacterial chromosome including 
inversions, duplications, and deletions of adjacent genes, 
replicon fusions, and transpositions of host DNA segments 
(57, 59, 60, 61, 160, 220, 241, 243), and (v) regulation, by DNA
methylation, of the expression of the modification gene,
mom (95, 96, 102, 119, 123, 205). As a transposon, Mu shows
another unusual peculiarity: its ability to exploit two transposition
mechanisms, replicative and nonreplicative, both
of which are mediated by the same recombinase, the Mu 
transposase (the product of Mu gene A) (31). The ability of Mu
to promote genome rearrangements during its lytic cycle,
in turn, was widely exploited in genetic manipulations for
the construction of a large collection of bacterial strains, 
particularly in Ariane Toussaint and Malcolm Casadaban’s 
laboratories. As befitting an organism with such an important
and varied life-style, there have been a number of
previous reviews of Mu (153, 185), including a chapter in the
previous edition of this book, published in 1988 (92), and
even an entire book devoted to phage Mu, which was published
in 1987 (235). Here we strive to bring the reader up to
date on our understanding of the biology of this fascinating
virus.


Overview of Mu 
The Mu Life Cycle 

The life cycle of Mu starts with adsorption and injection of 
its linear, double-stranded DNA genome into a sensitive host 
(figure 30-1). The injected DNA is flanked by a small region 
whose length and sequence vary in each viral particle and 
which constitutes a vestige of the host chromosome where 
the phage previously developed. Together with the DNA, a 
phage-encoded protein is also injected, the Mu N protein 
(product of gene N). This protein is necessary to convert 
the linear form of the phage genome into a circular, non- 
covalently closed form. Subsequently, in a process known 
as nonreplicative transposition (which is mediated through 
the action of the transposase, the Mu A protein) the two 
strands of phage DNA, but not the bacterial sequences 
associated with them, are randomly inserted into the host 
chromosome. In about 1-10% of the integrated genomes 
the lysogenic cycle is established. In these bacteria, the 
phage DNA remains tightly associated with the host 
genome and is inherited by daughter bacteria in the course 
of cell division. The immune state, that is the prevention 
of superinfection by other Mu particles, is established in 
the lysogenic bacteria. 


Figure 30-1 Phage Mu life cycle. Redrawn from Sokolsky and Baker (217). 


In the majority of bacteria the integrated Mu DNA molecules
carry on with their lytic cycle. The Mu A-protein transposase,
together with other factors, such as the Mu B
protein and various host DNA-replication proteins, amplify 
these molecules, which are never free in the cytoplasm but 
instead are transposed to other sites of the bacterial 
genome. Besides copies of Mu DNA, capsid proteins are 
produced and the phage DNA is packaged by a “headful” 
mechanism into the Mu virion particle (see chapter 6). At 
the end of this cycle, which in Escherichia coli needs about 
55 minutes at 37 °C, each infected cell produces about 
100 phage particles, which are released by cell lysis. These 
particles are heterogeneous for one of two possible sets 
of tail fibers, which determine the host range for the next 
infection cycle and which are determined by the orienta¬ 
tion of an internal, invertible region (G-segment, also known 
as G-region or G-loop). Much of this chapter is devoted to 
examining the Mu life cycle in increasing molecular detail 
and references to the mechanisms overviewed in this and 
the previous paragraph will be provided there. 
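
The effect of flipping an internal, invertible region can be pictured with a toy sequence. The sketch below replaces a chosen segment with its reverse complement, which is what a DNA inversion does at the sequence level. The sequence, coordinates, and function name are invented for illustration and bear no relation to the real G-segment.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def invert_segment(seq: str, start: int, end: int) -> str:
    """Replace seq[start:end] with its reverse complement (a DNA inversion)."""
    segment = seq[start:end]
    flipped = segment.translate(COMPLEMENT)[::-1]
    return seq[:start] + flipped + seq[end:]

toy = "AAAA" + "GATTC" + "CCCC"        # flanks are untouched by the inversion
once = invert_segment(toy, 4, 9)
twice = invert_segment(once, 4, 9)

print(once)
print(twice == toy)   # a second inversion restores the original orientation
```

The two orientations present different sequences to a fixed flanking region, so a population of particles can carry either of two arrangements of the same DNA.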

The Mu Virion 

Bacteriophage Mu virions consist of an isometric icosahedral
head, 54 nm in diameter, a knob-like neck, a
contractile tail, a baseplate and six short tail fibers (2, 239). 
The DNA isolated from mature phage particles is linear, 
double-stranded DNA with a size initially estimated to be 
between 37 and 42 kb (25, 51, 170, 240). This mean value
of about 39 kb is accounted for by the 36,717 bp of Mu DNA
present in each particle (185) plus 0.5-3 kb of attached,
variable-sequence host DNA. The density of Mu DNA is
1.71 g/cm³, which reflects a G-C content of 52.05% (169,
184, 239). 
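
The link between buoyant density and base composition can be checked with the empirical CsCl relation ρ = 1.660 + 0.098 × (G+C fraction). This is a commonly used approximation for double-stranded DNA and is not stated in this chapter, so the sketch below should be read as a consistency check under that assumption. The function names are illustrative.

```python
def buoyant_density(gc_fraction: float) -> float:
    """Approximate CsCl buoyant density (g/cm^3) of duplex DNA."""
    return 1.660 + 0.098 * gc_fraction

def gc_from_density(density: float) -> float:
    """Invert the same linear relation to estimate the G+C fraction."""
    return (density - 1.660) / 0.098

print(round(buoyant_density(0.5205), 3))   # density expected for 52.05% G+C
print(round(gc_from_density(1.71), 2))     # G+C fraction implied by 1.71 g/cm^3
```

Under this approximation the two quoted values agree to within the precision of the density measurement.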

Electron microscopic analysis of Mu DNA isolated from 
virions after denaturation and reannealing, restriction fragment
mapping, and DNA sequencing (9, 49, 130) reveals
three distinct regions of packaged Mu chromosomes: the
α-segment, the G-segment, and the β-end. The α-segment is
the majority of the genome, the part that is conventionally
drawn to the left of the G-segment. The presence of a short
(50-150 bp) host DNA segment at the left (c) end of the
α-segment (9) was demonstrated by Southern blots (27). The
G-segment is an invertible region of the Mu DNA that determines
host range. The β-end is a short region of the genome
to the right of the G-segment that contains the overlapping
com and mom genes. Attached to the right end of the β-region
is 1-2 kb of bacterial DNA representing the site where the
Mu DNA was previously integrated. When Mu virion DNA is
denatured and renatured, this DNA appears as single-stranded
“split ends.” The presence of this DNA within Mu
virions is a consequence of (i) the headful packaging 
mechanism employed by Mu, (ii) incorporation into virions 
of only Mu DNA that has inserted into the bacterial chromo¬ 
some, and (iii) a total Mu DNA length that is substantially 
less than that which may be theoretically packaged into the 
heads of Mu virions. The virus-encoded and genome-circu¬ 
larizing N protein is also packaged into the phage head. Both 
DNA and N protein are injected into the host cell upon 
adsorption (75). 

Genetic Map, Physical Maps, and DNA Sequence

By convention, the left and right extremities of the Mu 
genome are those that contain the c gene and the mom 
gene, respectively (figure 30-2). Sites attL (also known as L)
and attR (also known as R) are also found at the left and right
ends of the Mu DNA (figure 30-2) and define the junctions 
between the phage and the random host DNA sequences 
(120, 131). Between 1970 and 1985, 36 genetic structures, 
including genes, regulation sites, and att sites, were geneti¬ 
cally identified by various groups. The genetic map is the 
same for the prophage and mature Mu DNA. This contrasts 
with phage λ (chapter 27), which has a different integration
mechanism and consequently different chromosome ends
depending on whether the phage DNA is packaged within
the virion particle or integrated into the bacterial chromosome
(1, 9, 168, 192, 263). The maximum recombination frequency
for Mu is about 1% (41, 262, 263, 266).

The genetic organization of phage Mu is, as for other 
bacteriophages, constituted by functional modules. For Mu 
these modules, whose locations in the genome are indicated 
in figure 30-2, include (i) an integration-replicative module,
from attL to the “transposition” region, (ii) a module consisting
of semi-essential genes (the SEE region), (iii) a morphogenesis
module (including a region involved in programmed
DNA inversion), and (iv) a module consisting of the com and
mom genes. Gene distribution is roughly colinear with that
of the prophage form of bacteriophage λ (246) (see “Mu
and Mu-like Phage Evolution,” below). However, due to 
Mu’s double nature as bacteriophage and transposon, the 
Mu genome possesses a series of modules characteristic of 
viral-specific functions as well as a replicative module that 
confers on Mu its peculiar phage-transposon life-style. 
It is interesting to note that this last module can be aligned 
with the transposon Tn3 structure (128). 


The Integration-Replication Module 

The first module, from the left end of the Mu genome, is 
implicated in the processes of integration-replication of the 
phage chromosome. This is a DNA fragment of about 5 kb 
that is formed by the attL site to the left, and genes A and B
to the right. In addition, the attR site, which is located at the
opposite extremity of Mu DNA, is also involved in integration-replication.
Genes A and B are early genes, essential for
Mu DNA integration and replication (58, 191, 266). Gene 
A codes for the phage transposase that is required for 
integration of the Mu genome into the host chromosome as 
well as for replication of the phage DNA during the lytic cycle 
(202; for a recent review, see 30; see also “The Mu Transposase,”
below). The B gene codes for an ATP-dependent
DNA-binding protein that is required for efficient transposition
(36, 175). In this region, the c and ner genes are also
present, and are both involved in regulation of the lytic and 
lysogenic state. The c gene codes for the repressor protein 
(Repc) necessary to establish and maintain the lysogenic 
state and supply the cell with the superinfection immunity 
(73,107,147, 208, 275). The ner gene product (the Ner protein) 
negatively regulates the transcription from the promoters 
PcM (also known as Pc or Pc-2) and Pe, and seems essential 
for phage development (78, 79, 249, 266). 

The att sites, attL and attR, are two regions located at 
the left and right extremities of the phage DNA. Each is 
formed by three sites (not shown in figure 30-2): L1, L2, 
and L3 at the left end and R1, R2, and R3 at the right end. 
These sites are recognized and bound by the Mu A-protein 
transposase. Interestingly, Repc, besides its function as 
transcriptional regulator, also competes with the trans¬ 
posase for binding to the two att sites. 

Likewise, the transposase is able to bind two sites of the Pe 
operator region, O1 and O2, which are also bound by Repc. 
These two operator sites, O1 and O2, in fact, constitute what 
is described as a transposition enhancer (hereafter referred to 
as “enhancer” or, simply, E) which functions by interacting 
with the phage att sites, attL and attR (11) (reviewed below 
under “The Mu Transposase”), and which is also involved 
in the positive regulation of the transcription of genes ner, 
A, and B, the latter two products being intimately involved 
in Mu transposition. The same operator region also plays 
a critical role in inhibiting transposition by providing 
the binding target for Repc, which shuts off the transcription 
of gene A and therefore production of its transposase 
product (80). 

[Figure 30-2 map labels, left to right: bacterial DNA; attL; immunity; transposition; SEE; late transcription activation; lysis; head and tail genes; bacterial DNA. Gene order as far as legible: c, ner, A, B, kil, gam, sot, arm, gemAB, C, lys, D, E, H, F, G, I, T, J, K, L, M, Y, N, P, Q, V, W, R, S, U, U', S'. The PcM promoter is drawn pointing leftward.] 

Figure 30-2 Genetic map of the bacteriophage Mu. The arrow tips indicate the transcription starting points. 

Demonstration of the functional autonomy of the 
integration-replication module is its ability to behave as 
a transposon if carried by a recombinant plasmid. These 
plasmids, called miniMu, also harbor a thermosensitive 
mutation in the repressor gene, cts62. At low temperatures 
they consequently behave as independent replicons due 
to Repc activity, but can be induced to transpose at the 
higher temperatures at which the mutant Repc protein is 
no longer functional (35). 


The Semi-essential Genes Module 

The region between 4.3 kb and 10 kb of the Mu genome, 
between the B and C genes, is described as “semi-essential 
early” (SEE). Note that this C gene and the c gene discussed 
in the previous section are found at distinct loci. The c gene 
is found outside of and to the left of both the B gene (also 
discussed in the previous section) and the SEE region. The 
C gene, on the other hand, is found at the extreme right 
end of the SEE region (figure 30-2). Little is known about 
the biological role of the SEE region although a number 
of functions which affect burst size, DNA replication, and 
other aspects of Mu behavior have been identified in it (for 
review see 195). These functions permitted the identification 
of various genes such as kil, whose expression is lethal 
for the bacterium, gam (by analogy with the gam gene of λ), 
sot (stimulation of transfection), arm (amplification of 
Mu replication), and, controversially (see “Negative and 
Positive Control of Transcription,” below), cim (control of 
immunity). 

The SEE region is transcribed as a polycistronic message 
that originates from the Pe promoter. Within this region, 
however, another promoter has been identified, the Pgem 
promoter, which constitutively transcribes the last two 
genes of the SEE region (55), gemA and gemB (the latter also 
known as mor), which together are organized as an operon. 
The GemA protein modulates the expression of various host 
genes and is responsible for decreasing host DNA gyrase 
activity and thereby promotes DNA relaxation of the bacte¬ 
rial genome (70,196). The GemB/Mor protein is involved in 
Mu late gene transcriptional transactivation (74, 172). 
As discussed below, under “Lysogenic Conversion,” it is 
likely that both genes impact gene transcription. 


The Modules for Morphogenesis 

The first two genes of this region (C and lys) are not directly 
implicated in morphogenesis. The C gene, which defines the 
right end of the SEE (described above) codes for a positive 
regulator of late gene transcription while lys is involved in 
bacterial cell lysis and viral particle liberation at the end of 
the lytic cycle (see chapter 10). Indeed, lys mutants produce 
phages that are not released from the bacterial cell. A spe¬ 
cific enzymatic activity of Lys has not been shown experi¬ 
mentally (174), though its sequence affinities strongly 
suggest that it is a member of the true lysozyme family (185). 

Genes D-J, found immediately to the right of lys, 
are implicated in phage head-protein synthesis (73, 88). 
However, between genes G and I (a position that is both in the 
middle of the D-J stretch of genes and in the middle of 
the Mu genome) is also a site that is required for rapid Mu 
replication (204). This site is efficiently cleaved in vitro and 
in vivo by E. coli DNA gyrase in the presence of enoxacin 
(203), and has been labeled as the Mu Strong Gyrase Site 
(SGS). The subsequent genes in this morphogenesis module, 
L through R, code for tail proteins. Protein N, coded by gene 
N that is found within this L through R stretch, is packaged 
into the phage head and, as noted above, is involved 
in genome circularization at the start of infection (see 
“Circularization of Infecting DNA,” below). Genes S, U, S', 
and U', also found within the morphogenesis module, at 
its extreme right end, are specifically localized to an inver¬ 
tible sequence called the G-segment (reviewed below under 
“The Invertible G-Segment”). 

An additional module consists of the com-mom genes. 
The mom gene encodes an unusual DNA modification 
function that protects the viral genome against a wide 
variety of host restriction endonucleases (see also “Regula¬ 
tion of Mu Development,” below). Com is involved in post- 
transcriptional regulation of mom (124, 272). 

Promoters and Transcriptional Terminators 

The promoter PcM drives the transcription of the repressor 
gene, c, and, to date, is one of only two Mu promoters that 
drive left-oriented transcription (figure 30-2) (the other, 
momP2, is discussed below and in reference 229). Various 
promoters drive rightward transcription: Pe for 
the early region, a middle promoter (Pm), three promoters 
for late transcription (Plys, PI, and Pp), and 
a promoter region called Pmom. The middle promoter, 
Pm, promotes transcription of the gemB/mor gene. GemB/Mor 
activates C transcription and protein C, in turn, activates 
late gene expression, which is controlled by the promoters 
Plys, PI, and Pp (167). Within the 
region transcribed from Pe, another promoter was localized, 
Pgem, which constitutively transcribes a small operon, gem, 
that is made up of two genes, gemA and gemB (reviewed 
above and below) (17, 23, 98, 141, 142, 205, 207). 

An inverted repeat (IR) is found upstream of the gemA 
gene and is centered near the initiation point of Pgem- 
directed transcription. This IR acts as terminator for the 
transcript that originates from the Pe promoter (56). Other 
terminator sequences include IRs that lie downstream from 
the gemB/mor gene (56) and downstream of the C gene (104, 
165). These sequences, respectively, are found just upstream 
of the promoters Pm and Plys in figure 30-2. 

Mu Genome Sequencing 

New data on Mu genetic organization and the identification 
of new coding regions have been obtained following the total 
sequencing of phage DNA (185). Fifty-six probable phage 
Mu genes (open reading frames) were identified using 
E. coli codon usage as a search probe. This number compares 
to the 36 genes that have been genetically identified. The 
sequence analysis shows that it was possible in many but 
not all cases to correlate the genetically defined genes with open 
reading frames in the sequence. The majority of the 56 
genes have AUG start codons. Three genes—G, gene 45 (Q?), 
and gin —have GUG start codons, and one (gene c, also 
known as gene 1) is known from an earlier work to use a 
UUG start (68, 254). The largest gene, encoding the putative 
tail-length tape measure protein, was not previously identi¬ 
fied genetically. It lies between genes M and N in figure 30-2. 
The start sites for gene T, the major head subunit (which 
is found, as shown in figure 30-2, between genes I and J), 
and gene L, the tail sheath subunit, were confirmed by 
N-terminal sequencing. Many of the predicted Mu genes 
encode presumptive proteins that show similarity to pre¬ 
dicted proteins from a prophage in Haemophilus influenzae, 
as well as to predicted proteins encoded by prophage genes 
in other bacteria. 


Fate of Infecting DNA 

Once in the cell, the Mu DNA is circularized via the action 
of the N protein and, as seen with other temperate bacterio¬ 
phages, Mu can choose between the lytic and the lysogenic 
cycle (figure 30-1) (see chapter 8). However, in contrast 
to the other temperate bacteriophages, for phage Mu, DNA 
integration (also known as transposition) is a prerequisite 
for both cycles (161). Two transposition mechanisms take 
part in the Mu life cycle. The first is a nonreplicative (or 
conservative) transposition that mediates the integration 
of the infecting phage DNA into the host genome (6, 33, 
90, 158). The second mechanism, called replicative trans¬ 
position, is exploited instead to replicate the integrated 
phage DNA. In the latter mechanism one daughter Mu 
genome remains in place and the other daughter is repli¬ 
cated while integrated into a second site of the bacterial 
genome (35). 


Circularization of Infecting DNA 

Circularization of infecting DNA within the host cell is 
a rather common phage strategy (see chapter 7) that is used 
to protect phage genome termini from nuclease attacks, to 
convert the linear genome to an integrative precursor, or to 
represent the replicative form of the genome. Various 
mechanisms allow the achievement of this objective, such 
as covalent closure of the phage DNA (single-stranded tails 
at each 5' or 3' end can be complementary and therefore 
cohesive), recombination between redundant terminal 
sequences, or by means of a protein bound to the extremities 
of the genome that converts the linear phage DNA into 
a noncovalently closed form. Circularization of Mu DNA 
exploits this last mechanism. 

The existence of a phage-encoded protein injected into 
the bacterium during adsorption that was responsible for 
this noncovalent closure was suggested by Kahmann et al. 
(122). They observed that the Mu DNA was scarcely infective 
in bacteria that had been made competent by treatment with 
Ca2+. However, infectivity could increase in bacterial hosts 
lacking exonuclease V. Nevertheless, transfection still was 
very low compared, for example, with that obtained with λ 
DNA: about 1000-fold lower. The infectivity could increase 
by about 2 orders of magnitude, however, if the Mu sot gene 
was expressed in the recipient bacterium. The sot gene is also 
called gam, and it may inhibit a host nuclease (7). These facts 
were explained with the hypothesis that proteins, normally 
injected with the Mu genome and necessary for a positive 
outcome of the transfection, could be eliminated during the 
phenol extraction of Mu DNA (122). This hypothesis was 
sustained by the fact that the DNA extracted from the viral 
particles broken by freeze-thaw methods was about 1000 
times more infective than DNA extracted with phenol (38). 
In addition, treatment of DNA made by freezing and thawing 
with proteinase K or with pancreatic DNase resulted in a loss 
of infectivity. 

An extension of the above hypothesis is that the infective 
form of Mu DNA is a DNA-protein complex and that the 
protein portion of this complex consists, at the very least, of 
Mu protein N. The following observations support this exten¬ 
sion: (i) CsCl-purified DNA was found to be noncovalently 
associated with a 65 kDa polypeptide, but no protein was 
found in phenol-extracted Mu DNA (38). (ii) A supercoiled 
form of infecting Mu DNA was isolated by Harshey and 
Bukhari (93). Upon treatment with pronase, phenol or 
sodium dodecyl sulfate, however, this supercoiled DNA was 
converted to a linear, Mu-length form, indicating that the 
circle was not covalently closed but instead held together 
by proteinaceous material. (iii) A protein noncovalently 
bound to Mu DNA was identified in minicells infected 
with this phage (209). The 64 kDa polypeptide was found to 
co-sediment with Mu DNA through a sucrose gradient. 
(iv) Implying that complementary DNA ends are not involved 
in the interaction, Mu DNA circularization does not require 
removal of the E. coli DNA sequences that are attached to 
both ends of the Mu genome in the viral particle (209). 
(v) Antiserum to the protein purified from viral particles is 
specific for the Mu N product (75). 

Conservative Transposition 

The term “conservative transposition” (also known as 
nonreplicative transposition) means that, upon infection, 
both strands of the Mu DNA double helix are integrated 
without any previous DNA replication. Three kinds of 
experiments allowed determination of this mechanism. In 
the first experiment, Liebart et al. (158) showed that infec¬ 
tion of isotopically labeled “heavy” cells containing a small 
plasmid resulted in the formation of some Mu-containing 
plasmid DNA of a density consistent with integration 
without replication. The second, carried out by Akroyd and 
Symonds (6), found that a cell transfected with an artificially 
constructed heteroduplex Mu DNA molecule, containing 
one wild-type strand and one mutant strand, gave rise to 
a mixed burst with an approximately equal number of 
wild-type and mutant phage. The interpretation is that 
both strands of the infecting DNA were integrated but then 
segregated during subsequent DNA replication. The third 
experiment, by Harshey (90), showed that infecting phage 
DNA that was fully methylated (from phage grown on a 
strain that overproduces the E. coli dam methylase) was, 
when injected into a dam-mutant host, integrated in a fully 
methylated form. 
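
The reasoning behind the heteroduplex experiment can be illustrated with a toy simulation. The sketch below is not a description of the original protocol: the function names, the representation of a duplex as a pair of strand alleles, and the number of replication rounds are assumptions introduced purely for illustration. It shows that if both strands of a wild-type/mutant heteroduplex are integrated, semiconservative replication yields equal numbers of the two genotypes, whereas an integrated homoduplex yields a pure burst.

```python
from collections import Counter


def replicate(duplex):
    """One round of semiconservative replication.

    Each parental strand templates a complementary strand, so each daughter
    duplex carries the allele of one parental strand on both strands.
    """
    strand_a, strand_b = duplex
    return [(strand_a, strand_a), (strand_b, strand_b)]


def burst_genotypes(integrated, rounds):
    """Count genotypes after `rounds` (>= 1) rounds of replication."""
    if rounds < 1:
        raise ValueError("at least one round of replication is required")
    population = [integrated]
    for _ in range(rounds):
        population = [d for duplex in population for d in replicate(duplex)]
    return Counter(duplex[0] for duplex in population)


# Hypothetical heteroduplex: one wild-type strand, one mutant strand.
print(burst_genotypes(("wt", "mut"), rounds=5))
```

In this simplified model the 1:1 ratio is fixed after the first round of replication and is unaffected by later rounds, which is the sense in which a mixed burst argues for integration of both strands.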

Lysogenic and Lytic Cycles 

The integrated Mu DNA is committed toward two alternative 
cycles: lysis or lysogeny. Only in a minority of cases is the 
lysogenic cycle established, whereas the lytic (or productive) 
cycle is predominant in most infections. 

Frequency of Lysogenization 

Under conditions in which all the cells of a bacterial 
population are infected (high multiplicity of infection), 
the Mu lysogenization frequency, amongst survivors, varies 
between 1% and 10% (108). More stringent determinations 
have been obtained with the Mu mutant cts Amp in which 
the ampicillin-resistance gene is inserted into the G-segment 
(154) (see figure 30-2). The frequency of ampicillin-resistant 
bacteria, obtained upon infection with this phage and 
due to its stable integration (lysogenization), is on the order 
of 0.1% per phage particle. 

Superinfection Immunity 

In the lysogenic bacterium the constitutive expression of 
the c gene product, Repc, makes the cell immune to Mu 
superinfection. This state is stable and therefore the 
frequency of spontaneous prophage induction is very low, 
about 10 (159). Neither chemical nor physical factors are 
known to induce the prophage. However, as has also been 
observed for phage λ, prophage transfer by conjugative 
mating induces a lytic cycle in the recipient bacterium 
(zygotic induction) that results in cell death. The zygotic 
induction can be prevented, however, by using a Mu- 
immune recipient bacterium. On the other hand, the Mu vir 
mutant, which carries a mutation in the c repressor gene (68, 
253), kills a Mu lysogen upon infection but with a mechanism 
that is totally different from the one used by λ vir (see 
“The Mu Repressor,” below). Experimentally, induction of 
lysogenic bacteria is obtained using phages carrying a ts 
mutation in the repressor gene, such as Mu cts62 (107), 
which is the most widely used. These lysogens are stable at 
30 °C and are induced at 42 °C. 

Random Integration of Mu DNA 

In lysogenic bacteria, Mu DNA is inserted into randomly 
distributed sites. About 2-3% of lysogens carry auxotrophic 
mutations due to prophage integration in genes impli¬ 
cated in the corresponding biosynthetic pathways (234). 
Consistent with the randomness of this integration, Bukhari 
and Zipser (26) describe the mapping of 75 independent 
insertions of Mu DNA in the lacZ gene alone, showing 
that each integration event occurred in a different site. 
From these facts we derive the classical notion of Mu 
random integration. Many other investigations reinforce 
these observations (47, 50; for a review see 91). However, the 
random character of Mu integration does not exclude the 
existence of integration hot spots, such as the malK-lamB 
region of E. coli (210, 225). Other factors can also modify 
integration frequency. It has been observed, for example, 
that Mu integration in lacZ is reduced when that gene is 
actively transcribed (60). Adding to the randomness of Mu 
integration, insertion of the Mu genome can occur in either 
of the two possible orientations of the phage genome 
(109). Furthermore, analysis of the bacterial DNA flanking 
the inserted Mu DNA revealed the existence of 5 bp direct 
repeats of the target-site DNA. These 5 bp vary from one site 
to another and are a consequence of the Mu integration 
mechanism (30). 
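
The origin of the 5 bp direct repeats can be sketched in a few lines of code. The example below is a minimal string-level model, assuming that the two target strands are cut 5 bp apart and that the resulting single-stranded gaps are filled in on either side of the inserted element; the function name, the placeholder element "<Mu>", and the target sequence are hypothetical and chosen only for illustration.

```python
def insert_with_tsd(target, element, pos, stagger=5):
    """Insert `element` into `target` at `pos` with a target-site duplication.

    The `stagger` bases that follow `pos` end up on both sides of the element,
    mimicking gap repair after a staggered cut of the target.
    """
    if not 0 <= pos <= len(target) - stagger:
        raise ValueError("the staggered cut must lie inside the target")
    duplicated = target[pos:pos + stagger]
    return (target[:pos] + duplicated + element
            + duplicated + target[pos + stagger:])


# Hypothetical 15 bp target; the duplicated bases differ for every site chosen.
target = "ATGCCGTACGATTAG"
print(insert_with_tsd(target, "<Mu>", pos=4))
print(insert_with_tsd(target, "<Mu>", pos=8))
```

Because the duplicated bases are copied from whatever target sequence happens to be cut, the repeats vary from one insertion site to another, as described above.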

Proteins implicated in Mu lysogenization are those 
required for phage DNA integration. These phage-coded 
proteins are Mu A (the phage transposase) and Mu B, which 
stimulates integration frequency. Gene A amber mutants do 
not lysogenize and do not integrate their DNA into the 
bacterial host, as shown by the behavior of 32P-labeled 
gene A amber mutants (34, 161). Less evident is the role of 
Mu B in the process. Mu B amber mutants show an approx¬ 
imately 3-fold reduction in lysogenization frequency when 
compared with the wild-type phage (190). However, these 
mutants are at least 100-fold defective for replication (37,39), 
indicating a different impact of Mu B on integration than on 
replication. 

Lytic Replication 

The lytic cycle is the destiny of most integrated Mu DNA 
molecules. This cycle can be studied either in infected 
bacteria or, more easily, upon induction of lysogenic bacteria 
carrying a thermosensitive mutation in the repressor gene. 
Even though Mu DNA replication cannot be separated from 
a general model of transposition, in this section we will 
examine only the main characteristics of the replication 
process, whereas analysis of the transposition mecha¬ 
nism will be treated in the next section (for a recent review 
see 31). 

Mu DNA replication starts between 6 and 8 minutes after 
induction (255, 264) and continues for about 40 minutes. 
Replication is semiconservative (201) and only two phage- 
encoded proteins are implicated: Mu A and Mu B. In this 
process, the integrity of the Mu terminal sequences is 
essential (246, 256). During the lytic cycle, various rear¬ 
rangements, such as deletions, inversions, replicon fusions, 
and transpositions are produced in the host cell. These 
rearrangements are a consequence of the transpositive 
replication itself. In addition, free Mu DNA molecules are 
never found in the cell, although circular forms of DNA 
consisting of Mu DNA associated with heterogeneous 
sequences of bacterial genome can be observed. 

The critical experiment showing that Mu DNA replica¬ 
tion is peculiar compared with that of other bacteriophages 
was performed by Ljungquist and Bukhari (159). These 
authors showed that upon induction of a Mu lysogenic 
strain, Mu DNA replication occurs in situ without excision 
of the prophage from the bacterial chromosome. Restric¬ 
tion endonuclease digestion, which cuts both bacterial and 
phage sequences, and the separation on agarose gel of the 
fragments obtained, showed DNA fragments containing 
both phage and bacterial DNA. These fragments span the 
junction between the prophage extremities and the bacterial 
DNA. The persistence during the entire lytic cycle of the 
junction fragment seen in the lysogen suggested that 
the Mu DNA was replicated in situ, that is without leaving 
the bacterial genome. In addition, during the lytic cycle 
other junction fragments between Mu and bacterial DNA of 
various sizes accumulate, providing evidence that in situ 
replicated DNA is integrated into new sites of the genome. 

Packaging of Mu DNA 

Mu DNA packaging is an oriented process starting with the 
recognition of the pac site (87), which resides 32-54 bp to 
the right of the left end of the Mu chromosome, between the 
transposase binding sites L1 and L2 found within attL (89). 
The cut in the DNA at the left end occurs in the host DNA 
flanking the Mu insertion site at about 100-200 bp to the 
left of the pac site. Therefore, 50-150 bp of this bacterial 
sequence are encapsidated in the phage particle. The 
packaging mechanism is “headful,” as deduced by Bukhari 
and Taylor (25) who observed that insertions in Mu DNA 
were compensated by a reduction in the bacterial DNA 
length that is associated with the Mu DNA. That is, the 
total length of encapsidated DNA depends on the size of the 
phage head and, since more than the 37 kb of Mu DNA 
can be inserted in the phage capsid, the additional DNA 
consists of the bacterial DNA flanking the integrated 
Mu genome, with most of that DNA flanking the right end 
of the Mu genome. 
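
The compensation observed by Bukhari and Taylor follows from simple bookkeeping, sketched below. The 37 kb Mu length and the 50-150 bp left flank are taken from the text; the headful capacity of 39,000 bp is a hypothetical value used only to make the arithmetic concrete, and the function and constant names are likewise illustrative.

```python
MU_GENOME_BP = 37_000        # approximate Mu genome length given in the text
ASSUMED_HEADFUL_BP = 39_000  # hypothetical head capacity, for illustration only


def right_flank_bp(insertion_bp=0, left_flank_bp=100,
                   headful_bp=ASSUMED_HEADFUL_BP, mu_bp=MU_GENOME_BP):
    """Host DNA packaged beyond the right end of Mu under a headful model."""
    if min(insertion_bp, left_flank_bp, headful_bp, mu_bp) < 0:
        raise ValueError("lengths must be non-negative")
    remaining = headful_bp - left_flank_bp - mu_bp - insertion_bp
    # Once the head is full nothing more is packaged, so the flank cannot be negative.
    return max(remaining, 0)


# Each extra base pair inserted into Mu removes one base pair of right-end host DNA.
for insertion in (0, 500, 1_000):
    print(insertion, right_flank_bp(insertion_bp=insertion))
```

Under these assumptions the packaged total stays constant, so any insertion into the Mu genome is paid for by a shorter stretch of bacterial DNA at the right end.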

Little is known about Mu packaging at the molecular 
level. It is still not known what determines the pac specificity, 
the cut specificity to the left of the c gene, or the proteins 
implicated in pac-site recognition. Although mutants in 
genes E and I do not make DNA molecules of mature Mu 
length, there is no proof of a direct role of their gene products 
in Mu packaging. Partially addressing this dearth of knowl¬ 
edge, an in vitro system for maturation and encapsidation 
of the Mu-like phage D108 was set up by Burns et al. (28). 
In this assay a crude extract of cells in a late phase of the Mu 
lytic cycle was able to mature and encapsidate the D108 
DNA using the Mu packaging machinery. The assay 
revealed that ATP is necessary for the reaction and that the 
D108 transposase inhibits this process. Replication is not 
required for either Mu or phage D108 DNA packaging. 

Conservative and Replicative Transposition 

A peculiar characteristic of Mu bacteriophage is, as already 
stated, its ability to exploit two alternative transposition pathways, 
which characterize different stages of its life cycle. 
A nonreplicative mechanism characterizes the integrative 
transposition that mediates the phage DNA integration 
in the host genome (also known as conservative transposi¬ 
tion as well as integrative transposition), whereas replicative 
transposition allows the generation of new copies of the 
viral genome ultimately found within the phage burst 
at the end of the lytic growth. In this section we consider 
the molecular mechanisms of Mu transposition in greater 
detail. 

Transposition 

The transposition mechanism is formally described in 
the Shapiro model (220). A staggered cut of the target DNA 
sequence is made at two specific phosphodiester bonds 
found on opposite DNA strands, one located adjacent to 
each of the two phage DNA ends (L1 and R1) (figure 30-3). 
The resulting 3'-OH groups are joined via transesterification 
to two phosphodiesters placed 5 bp apart on the two 
strands of the target DNA. This is a process known as the 
“strand transfer reaction,” which gives rise to what is 
described either as a “strand transfer” or “Shapiro” intermediate. 
The free 3'-OH sites, generated between the host and 
the Mu DNA, are needed to initiate continuous-strand replication. 
Covalent extension of these ends, generated during 
transposition, would explain why DNA polymerization 
occurs only toward and then into the Mu DNA rather than 
also outward into the bacterial chromosome. This replicative 
process, which is mediated by host proteins, yields a transposition 
product in which each phage end is covalently 
attached to both the flanking DNA and the target DNA. 
Subsequent separation of the DNA strands of the phage 
double helix, as occurs during semiconservative DNA 
replication, ultimately separates the donor and target 
DNA molecules. 

Figure 30-3 The Shapiro model for transposition. 


The transposition process was reconstructed in vitro 
by Mizuuchi using partially purified proteins (179). The 
requirements of that system are a supercoiled miniMu 
substrate, which contains the left end (attL) and the right 
end (attR) transposase binding regions and the O1 and O2 
enhancer elements. Also required are the Mu protein-A 
transposase, the Mu B protein, two host-encoded acces¬ 
sory proteins (HU and IHF), and a target DNA molecule into 
which miniMu integration will occur (153, 180). The Mu 
nucleoprotein (also known as transpososome) complex 
consists of all these elements and its composition is now 
well characterized (for a review see 31). 

Conservative Transposition 

It is still poorly understood how Mu performs nonreplicative 
(also known as conservative or integrative) transposition. 
Phage DNA enters the bacterial cell in a linear form and 
is circularized by the N protein (75), but the role of this 
protein-DNA complex in integrative transposition has not 
been determined. In addition, both strands of phage DNA, 
but not the host sequence flanking the infecting DNA, are 
incorporated into the new host. Early in vitro studies of 
Mu transposition suggested that the outcomes of the two 
transposition pathways result from alternative processing 
of the Shapiro intermediate (43, 44); that is, conservative 
transposition could originate from the repair of the strand 
transfer intermediate, coordinated with deletion of the 
bacterial DNA carried by the pre-integration circularized 
phage genome, to generate a simple insertion. 

Recently the analysis of Mu B mutants—which are able 
to stimulate integration to a comparable extent as wild-type 
Mu B but are unable to support the formation of the Shapiro 
intermediate during in vivo replication—suggested that 
nonreplicative and replicative transposition in phage Mu 
may diverge before formation of the Shapiro intermediate 
(216). In addition, using both gyrase-inhibiting drugs and 
gyrase mutants, Sokolsky and Baker (226) showed, consistent 
with previous studies, that gyrase inhibition causes 
severe defects in replicative transposition since complexes 
between the Mu protein-A transposase and Mu DNA, and 
thus transpososomes, fail to form. By contrast, gyrase activity 
is not essential for conservative transposition (226). 

Replicative Transposition 

Contrary to what happens during integrative transposition, 
recombination steps during replicative transposition are 
well characterized. The initial step for transposition requires 
the assembly of higher order protein-DNA complexes, the 
transpososomes, in which the two phage DNA extremities, 
a transpositional enhancer (i.e., the O1 and O2 binding 
sites) also called IAS (internal activator sequence), which 
is reviewed below, and a tetrameric Mu protein-A transposase 
take part. In addition, a number of protein cofactors 
participate, including the Mu protein B, which serves as 
both ATPase and target DNA activator. 

Figure 30-4 The transposase targets. The att site, located at the phage extremities, and the enhancer sequence are 
recognized and bound by the phage transposase. Each att site is formed by three binding sites for Mu A (see “Replicative 
Transposition”). These structures are also recognized by the phage repressor, which shows overlapping binding specificity 
with the transposase. 

The att sites, to which the protein-A transposase 
binds, reside at the two extremities of the phage genome 
(figure 30-4). Each att site, attL and attR, is formed by three 
transposase-binding sites, L1, L2, and L3 and R1, R2, and R3 
(40). These binding sites are related by a 22 bp consensus 
sequence, 5'-GTTTCAYNNRAARYRCGAAAR(A/C), 
that shows no obvious internal symmetry (46). Binding of 
the protein-A transposase to these sites results in bending 
of the Mu end DNA by approximately 60-90° (3, 53). 
The three binding sites of attL are all oriented in the same 
direction but have a different spacing: the L1 and L2 sites are 
separated by about 80 nucleotides while L2 and L3 are 
nearly contiguous. By contrast, the three attR sites 
are adjacent, but the R3 site is oriented in the opposite direction 
with respect to R1 and R2. The binding affinity of the 
protein-A transposase also appears to be greater for the L1, L3, 
and R3 sites, as determined by DNase I footprinting of the A 
protein on the linear ends. By contrast, R1 and L2 are weak 
and R2 intermediate (11, 150, 276). On the other hand, footprinting 
of the core type 1 complex (see “Transposition 
intermediates”) showed that A protein covers sites L1, R1, 
and R2 (150). 
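
A degenerate consensus of this kind can be compared against candidate sequences with a short script. The sketch below is illustrative only: it writes the final (A/C) position as the IUPAC code M, counts mismatches in every 22 bp window of one strand, and uses a synthetic test sequence rather than real Mu end sequence. The function names and the mismatch threshold are assumptions, and scanning the reverse complement (needed for a site in the opposite orientation, such as R3) is left out.

```python
# IUPAC codes used by the consensus; (A/C) is written as M.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "M": "AC", "N": "ACGT",
}
CONSENSUS = "GTTTCAYNNRAARYRCGAAARM"  # 22 positions


def mismatches(window, consensus=CONSENSUS):
    """Number of positions where `window` violates the consensus."""
    return sum(base not in IUPAC[code] for base, code in zip(window, consensus))


def scan(seq, max_mismatches=0):
    """Return (offset, mismatch count) for each window within the threshold."""
    seq = seq.upper()
    width = len(CONSENSUS)
    hits = []
    for start in range(len(seq) - width + 1):
        count = mismatches(seq[start:start + width])
        if count <= max_mismatches:
            hits.append((start, count))
    return hits


# Synthetic sequence built to satisfy the consensus; not taken from Mu.
synthetic_site = "GTTTCACAAGAAGCGCGAAAGA"
print(scan("CC" + synthetic_site + "TT"))
```

Allowing a few mismatches (for example `max_mismatches=3`) is usually more realistic than demanding a perfect match, since individual binding sites only approximate a consensus.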

Sites O1 and O2 of the operator, to which the Mu Repc 
protein binds, contain two clusters of Mu protein-A binding 
sequences (IAS sequence) (181). The binding affinity of the 
A-protein transposase for IAS is lower than the affinity of 
the Mu A protein for the two att sites, attL and attR (156). 
The att sites, however, are implicated in a complex circuit of 
interactions with the transpositional enhancer sequences, 
O1 and O2 (11, 184), as discussed below (particularly under 
“The Transposition Intermediates”). 

There are two important aspects to the transpositional 
enhancer’s functionality. The first is the integrity of the 
whole IAS region. In fact, when the two hemisites are 
separated by means of digestion with restriction endonuclease, 
the stimulating effect is no longer observed even 
if a high concentration of each of the two segments, O1 and 
O2, is present. The second aspect that characterizes the 
transpositional enhancer's functionality is the DNA bending 
due to binding of IHF (a host-encoded accessory protein) to 
the site located between O1 and O2 (215). Actually, in vitro 
experiments with miniMu showed that the stimulating 
effect of the transpositional enhancer is mainly observed 
when complexes between the L and R extremities of the Mu 
chromosome (attL and attR, respectively) are formed, while 
the assembly of complexes either with a couple of R-ends or 
with one of the L-ends does not depend on the presence or 
absence of the transpositional enhancer. Therefore, the 
transpositional enhancer helps to avoid an incorrect pairing 
of Mu-extremity L- and R-ends (259) and its function could 
be to stabilize the transition state from what is known as a 
three-site synapse—between the L-end, the Enhancer, and 
the R-end—which is called an LER complex (182). 

This model is supported by the observation that the transpositional 
enhancer becomes dispensable once the transpososome 
is formed (183, 231). However, the transpositional 
enhancer remains associated with the att sites even 
after strand transfer has been completed (as shown in the 
type 2 complex presented on the far right of figure 30-6). 
Perhaps the sequestration of the transpositional enhancer 
within the LER complex prevents the Mu Repc protein from 
binding to the enhancer, thus signaling a commitment to 
transposition (201). 

The Mu Transposase 

The Mu A transposase is a complex protein with a modular 
organization (figure 30-5). In solution it is a monomer of 
663 amino acids with a molecular weight of 75 kDa (94) 
and is able to bind target sites on phage DNA. However, it is 
in a tetrameric form that it promotes the DNA cleavage 
and joining reactions of transposition (reviewed in 31, 152, 
180). All six att sites (L1-L3 at the left end and R1-R3 at the
right end) and the two operator sites, O1 and O2 (which
constitute the transposition enhancer), are bound by the
Mu A protein in its monomeric form. The Mu A tetramer, by
contrast, binds only to three of the att sites: L1, R1, and R2.
Partial proteolysis allowed the identification of the various
protein-A domains (189). The N-terminal domain is responsible
for DNA binding and contains motifs for recognition
of the two types of DNA sites bound by the transposase:
the phage extremities (attL and attR) and the enhancer site
(O1 and O2). In particular, domain Iα binds the enhancer
site and domain Iβγ binds the two att sites.

The central domain, domain II, contains in its N-terminal 
proximal part (subdomain IIα, which is not explicitly shown
in figure 30-5) three amino acid residues: Asp, Asp, and Glu
(DDE) at positions 269, 336, and 392, respectively. These
residues are essential in the strand-cleavage and strand-transfer
transposition steps and constitute the catalytic core of the
transposase enzyme (14, 136, 143). Mechanistically, the three
residues coordinate the divalent metal ions
necessary for catalysis (162, 268). Subdomain IIβ (also not
shown explicitly in figure 30-5) has a large positive charge
potential (214) and has been implicated in metal-assisted
assembly of the protein-A transposase tetramer.

The C-terminal domain, or domain III, is also required 
for assembly of the transposase tetramer and probably for 
its chemical competence. Domain IIIα has both nonspecific
DNA binding and cryptic nuclease activity (16, 155, 271).
Domain IIIβ interacts with the Mu B protein, promoting
strand transfer to the target DNA (15, 155, 270). Domain IIIβ
also interacts with the host-encoded chaperone, ClpX (157),
which is involved in disassembly of the transpososome after 
transposition and prior to replication (145). 

The Role of Mu B 

Mu B, a protein of 312 amino acids (figure 30-5), is an 
ATP-dependent DNA binding protein that is able both 
to capture the target DNA and to interact with the other 
complexes of the transpososome (186). Mu B acts as an 
allosteric activator of the Mu A transposase, although 
the molecular mechanism by which Mu B stimulates the 
transposase is unknown. Williams and Baker (267) showed 
that Mu B activates transposition by stimulating the reaction 
step between cleavage and joining that is otherwise slowed 
by the 3'-flanking strand, that is, the bacterial DNA located
between the integrated Mu DNA ends and the sites
of DNA cleavage (these 3' ends are shown in figure 30-3 
immediately above the label “Cleaved intermediate”). 

Mu B is formed by two globular domains, of which the
N-terminal domain (25 kDa), which contains what is known
as an AAA+ ATPase motif (189), shares structural similarity
with the N-terminal domain of the E. coli helicase,
DnaB (112), and shows nonspecific DNA binding activity
(figure 30-5). The 11 kDa C-terminal domain (237) plays an
important role in the protein-DNA and protein-protein
contact needed to regulate transposition (40). Peptide binding
experiments revealed that the region of Mu B that binds
to Mu A lies between residues 217 and 235, a segment
included in the N-terminal part of domain II. In vivo,
Mu B proteins lacking residues 295-312 are able to promote
integrative transposition but, interestingly, not replicative
transposition (34, 216).

Figure 30-6 Transpososome assembly during Mu replicative transposition. Redrawn after Jiang et al. (113). The process is
described in the text, under “Replicative Transposition.”

Mu B is also involved in the process called “target immunity,”
namely the inactivation of DNA sites proximal to the
transposon as potential transposition targets, a phenomenon
shared by many transposable elements (42). Mu
uses a mechanism that favors its insertion in sites located
10-25 kb from the original location of the phage genome
(164), protecting an entire supercoiled domain of the bacterial
chromosome. This allows Mu to avoid having its genome
inactivated by an intramolecular DNA strand reaction.
Although the process has not been completely elucidated, a key
step is the formation of Mu B oligomeric complexes, which
are formed by the protein binding AT-rich DNA regions (3,
83, 84). Through studies employing total internal reflection
microscopy, which permits examination of Mu protein B
behavior on a single DNA molecule, Greene and Mizuuchi
(85) showed that, after a stochastic nucleation event, assembly
of Mu B oligomeric complexes occurs by a mechanism
involving the sequential binding of small units of Mu B to
form large polymeric complexes. The Mu A-protein transposase
then stimulates the Mu B ATPase function, causing
dissociation of the Mu B protein from the DNA. One model
predicts formation of a multivalent complex between
the tetrameric form of the Mu A protein and the Mu B
oligomer, which thereby catalyzes processive disassembly
of Mu B from the potential DNA target (83, 84). Consequently,
Mu B accumulates only in those DNA regions not bound
by Mu A (3, 83, 84). The DNA decorated with Mu B oligomers
is then recognized by Mu as a potential target for its genome
insertion, whereas the naked DNA is not recognized (15).
Thus, Mu B dissociation from transpososome-proximal sites
makes these DNA regions inaccessible as possible targets
for the DNA strand transfer reaction.

The Transposition Intermediates 

Initiation of phage Mu DNA transposition requires assembly
of higher order protein-DNA complexes called transpososomes,
as identified by in vitro studies (31, 44, 179). In the
presence of a target-DNA substrate such as a supercoiled
miniMu plasmid, the phage-encoded A protein, bacterial
products HU and IHF, and an appropriate divalent ion, the
LER initial complex is formed (figure 30-6). In this structure,
through complex protein-protein and protein-DNA
interactions, the Mu left (L/attL) and right (R/attR) ends and
the enhancer element (E) are gathered (259). By analyzing
the events preceding LER complex formation, Pathania
et al. (201) showed that through Mu A-protein transposase
the transposition enhancer interacts solely with attR and
that this event precedes the entry of attL into the synaptic
complex.

LER is an unstable complex that, in the presence of
Ca²⁺, is quickly converted into the first stable complex, the
transpososome type 0 (183, 258), where the two Mu ends
are engaged within the active site of the transposase
(figure 30-6). In this structure the enhancer is no longer
associated with the ends and Mu A has tetramerized (32).
In the presence of magnesium, the 3' ends of Mu DNA are
nicked and the type 0 complex is converted into a type 1
complex, also known as a cleaved donor complex (45, 232).
The two subunits within the Mu A tetramer, which are
associated with sites L1 and R1 (sites that undergo cleavage),
provide in trans their DDE amino acid residues towards the
strand cleavage/transfer reaction.

The addition of Mu B, ATP, and target DNA results in
the formation of type 2 complex (figure 30-6). This is also
known as a strand transfer complex (STC) and is the product
of the transfer of the 3' ends of the Mu DNA to a target DNA
molecule (32, 45, 232). In the absence of target DNA, Mu B
and ATP stimulate intramolecular strand transfer, with the
3' ends of Mu DNA transferred into a new DNA site of the
same donor molecule (15, 175).
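
The order of the complexes described above can be summarized as a small bookkeeping sketch. It is only an illustrative simplification: the names `TRANSITIONS`, `advance`, and `run` are hypothetical, the cofactor labels are plain strings taken from the text, and the table ignores kinetics, the intramolecular strand transfer that occurs without target DNA, and any requirement not stated in this section.

```python
# Illustrative only: each entry is (starting complex, resulting complex,
# factors that the text says must be present for the conversion).
TRANSITIONS = [
    ("LER", "type 0", {"Ca2+"}),                         # first stable complex
    ("type 0", "type 1", {"Mg2+"}),                      # 3' ends nicked (cleaved donor complex)
    ("type 1", "type 2", {"Mu B", "ATP", "target DNA"}), # strand transfer complex (STC)
]


def advance(complex_name, available):
    """Return the next complex if all listed factors are available, else the same one."""
    for start, end, needed in TRANSITIONS:
        if start == complex_name and needed <= set(available):
            return end
    return complex_name


def run(available, start="LER"):
    """Follow the table from `start` until no further conversion is possible."""
    state = start
    while True:
        nxt = advance(state, available)
        if nxt == state:
            return state
        state = nxt


if __name__ == "__main__":
    print(run({"Ca2+"}))                                         # stops at type 0
    print(run({"Ca2+", "Mg2+"}))                                 # stops at type 1
    print(run({"Ca2+", "Mg2+", "Mu B", "ATP", "target DNA"}))    # reaches type 2
```

The table-driven form keeps the stated requirements in one place, so the sketch can be adjusted if a different set of conditions is of interest.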

The Transpososome-Replisome Transition 

Nakai et al. (188) describe an assay based on that of Mizuuchi
(179 and described above) that reproduces in vitro the transition
from transpososome formation (as discussed immediately
above) to the replication of Mu DNA, which occurs
during the Mu lytic cycle. The transition is thought to resemble
homologous recombination with strand exchange giving
rise to DNA replication. The strand exchange step requires
only five components: Mu protein-A transposase, the Mu B
protein, HU (a host factor), supercoiled Mu DNA, and a
target DNA substrate. A total of eight purified E. coli replication
proteins are then employed, along with the Mu strand
exchange product (labeled “strand transfer intermediate” in
figure 30-3), to effect replication of the Mu chromosome in
the course of integration into the target DNA substrate.

It has been observed that the Mu A protein (the transposase)
remains so tightly bound to the DNA that it blocks the
action of the E. coli replication proteins necessary to replicate
the strand transfer intermediate (187). After removal
of Mu A, these E. coli proteins can replicate the strand transfer
intermediate, forming what is known as a cointegrate
(figure 30-6) that consists of two daughter Mu genomes,
one integrated in the original (donor) position and the
second integrated at a second (target) position (or DNA). The
Mu B protein and the E. coli HU protein have also been found
to be loosely bound to the DNA making up the strand transfer
intermediate (151). Therefore, the type 2 (strand transfer)
complex (STC) must be destabilized before replication can
occur (157). In fact, after phenol extraction of the transpososome,
some of the replication proteins can enter the
replicative forks: under these conditions the E. coli DNA pol I
catalyzes a limited strand displacement synthesis (116, 144).

STC conversion to the cointegrate occurs if, besides the
eight proteins, a partially purified fraction of the host
enzymes (Mu replication factor, MRF) (144) is added. However,
no cointegrates are formed when E. coli DNA Pol III,
DnaB, and DnaC are omitted from the reaction system.
Therefore, the transpososome bound to the template imposes
a strict requirement for both MRF and the specific
replication proteins in order to initiate Mu DNA synthesis.
The MRF complex was thereafter separated into two fractions,
MRFα and MRFβ, each formed by many components
(187). MRFα includes the chaperone protein, ClpX, and
other as yet unidentified factors called MRFα2.

The E. coli ClpX protein is involved in reactivating
damaged proteins (224) and can associate with E. coli ClpP
protein (a serine peptidase) to form the ClpXP protease
complex that is involved in protein degradation (86). The
ClpXP complex degrades Repc (the Mu repressor) and
thereby induces the lytic cycle. It is also required for
in vivo Mu replication (177). ClpX is responsible
for the first steps in the transition from transpososome to
replisome. The ClpX protein recognizes a specific peptide
sequence at the A-protein transposase C-terminal extremity
(domain III, figure 30-5) and, through its unfolding activity,
remodels the strand transfer complex (at this point, STC1) to
form another more fragile complex, called STC2 (145). In this
complex, the phage extremities are still maintained in a
synaptic state. The Mu A protein is activated so that it
can recruit crucial host factors needed to initiate Mu DNA
synthesis by specific replication enzymes (145). Together
with the unidentified host factors known as MRFα2, the
transpososome’s role is the pre-primosome assembly at the
level of Mu DNA forks. MRFβ consists of the PriA, PriB, and
DnaT proteins that make up the E. coli primosome (115). Its
role in vivo was confirmed by the fact that Mu cannot give
rise to its lytic cycle or replicate its DNA by transposition in
knockout priA (115) or dnaT (218) mutants. It was hypothesized
that these primosome components promote the primosome
assembly for phage Mu at the site of homologous
strand exchange (188).

The transition steps from transpososome to replisome
could be: (i) The molecular chaperone ClpX converts STC1
to STC2 (types 1 and 2 in figure 30-6), altering the conformation
of the transpososome. (ii) MRFα2 then displaces
the transpososome to assemble the prereplisome at the Mu
forks to form what is known as strand transfer complex 3
(STC3). (iii) PriA, a component of both MRFβ and the E. coli
primosome, binds to the forked DNA structure created by
strand exchange (the strand transfer intermediate) and
begins the assembly process of a replisome at one Mu end.
PriA promotes the assembly of the replisome, preferably
starting from the left end as occurs during in vivo Mu replication.
The mechanism that determines which Mu end
is used to initiate DNA synthesis is not yet clear. (iv) PriA
then assembles a prereplisome complex by recruiting E. coli
PriB, DnaT, and DnaB-DnaC complexes. In this process,
DnaB must be bound to single-stranded lagging-strand
template (which is not to be confused with the “lagging
strand” observed in the formation of Okazaki fragments in
systems that possess replication forks). To create this binding
site, PriA unwinds duplex DNA by translocating 3' to 5'
along this template. Once bound to the DNA, DnaB attracts
primase to form a primosome, which catalyzes primer synthesis
for lagging strand synthesis. Meanwhile, E. coli DnaB
promotes binding of the DNA pol III holoenzyme to complete
replisome assembly (188).

Regulation of Mu Development 

Upon infection of a sensitive host or induction of a lysogenic 
bacterium, Mu DNA is transcribed by the bacterial RNA 
polymerase in a cascade event. This transcription is strongly 
asymmetric and unbalanced in the rightward direction since 
only the repressor gene, c, is leftward transcribed (i.e., from 
the promoter, Pc, also known as Pc-2 or PcM: figure 30-2). 
Transcription of bacteriophage Mu overall can be divided 
into three phases: early, middle, and late. 

Early Transcription 

Early transcription starts about 1.5 minutes after induction
of a Mu lysogenic strain and lasts about 6 minutes. Hybridization
studies have indicated that, during this phase,
transcription is directed rightward on the conventional Mu
map (13, 250, 265). Transcription begins from Pe (141) and
transcribes the early region (13, 250, 265) from 1 to about
8 kb from the left end of Mu DNA (56) (see figure 30-2).
In this region we find the essential genes, A and B, which
are implicated in Mu integration and replicative transposition,
and the ner gene, whose product negatively regulates
the amount of early transcript during the lytic cycle (245,
249, 266). Soon after, the transcription from Pe enters the
semi-essential region. The kil gene is immediately transcribed.
The other genes of the region (77), with the exclusion
of the gem operon, which is subject to its own
regulation (56), are transcribed after a delay that is perhaps
due to polymerase pausing. Neither Mu protein synthesis
nor DNA replication is required for early transcription (169,
265, 266).

Middle Transcription 

Middle transcription starts 4-8 minutes after induction
and increases until bacterial lysis. S1 mapping experiments
localized the start site of this transcript about 740 bp
upstream of the C gene, which is under the control of the
middle promoter, Pm (227, 228). The sequence analysis of
the Pm DNA region shows a significant similarity with
the −10 consensus sequence of E. coli promoters, but the
homology with the −35 region is weak (227). Middle
transcription differs from early and late transcription since
it needs both Mu protein synthesis and Mu DNA replication.
The Mu middle promoter is positively regulated by one
of the gem operon gene products, GemB/Mor (74, 172). By
studying the kinetics of Mu transcription after induction of
a Mu cts lysogenic strain that additionally carries an amber
mutation in the gemB gene, Giusti et al. (74) observed that
the pattern of early transcription (i.e., the first 10 minutes)
was essentially identical to that of a gemB+ strain. However,
after the ner-dependent pause, the gemB− induced lysogen
displayed a dramatic alteration in transcription abundance:
the expression of late genes was both slowed down and
reduced in the mutant with respect to that of gemB+ phage.
In addition, the β-galactosidase activity produced by a plasmid
in which the C gene promoter, Pm, was fused with the
lacZ gene was stimulated after infection with Mu wild-type,
but not after infection with the gemB− mutant. Analogous
results were obtained by Mathee and Howe (172), who
observed that the β-galactosidase synthesis expressed by a
plasmid carrying the fusion Pm-lacZ increases more than
20 times in bacterial strains where the Mu prophage is
induced.

How GemB/Mor acts is poorly understood. Mathee and
Howe (173) showed that Mor is not an alternative σ subunit;
Kahmeyer-Gabbe and Howe (126) hypothesized that it could
function as an accessory DNA-binding protein for activation
of E. coli RNA polymerase. Mor binds the Pm promoter in
a sequence between −56 and −33 from the transcription
starting point and this region is also bound by the phage
repressor, Repc. It is therefore possible to hypothesize that
the repressor might negatively regulate middle transcription
(126). Since GemB/Mor is produced, at a low level, along with
GemA in lysogens (56), it might activate Pm and produce
enough C protein to activate the phage morphogenesis and
lysis functions and thereby reduce cell viability. It was therefore
hypothesized (126) that Mu may have evolved additional
safeguards to prevent expression of C and subsequent lytic
functions in a lysogen. One of those safeguards could be
the direct repression of Pm by Repc. A second safeguard
could be the requirement for Mu DNA replication for Pm
activity.

Late Transcription 

Late transcription starts about 10-12 minutes after induction
and increases for 45-60 minutes until bacterial lysis.
Three late promoters, Plys, PI, and Pp, together with the
Pmom region (23, 166, 167), are involved in this transcription
to express the genes responsible for capsid synthesis, Mu
DNA modification, and cell lysis. The C product acts as an
activator for the RNA polymerase necessary for this transcription
(23, 99, 166, 167, 228). The C gene codes for a
16.5 kDa polypeptide (140 amino acids) that is a site-specific
DNA-binding protein (17, 23, 71, 210).

C protein could act as an accessory factor of the
RNA polymerase, like the cI or cII protein of λ or the host
proteins CAP or OmpR. These and other activators bind
DNA at or upstream of the −35 regions of specific promoters,
allowing recognition and activation of the host RNA
polymerase (166). In the four late promoters the A and T
residues in the second and sixth positions, respectively,
of the consensus hexamer (TATAAT) for the σ70 subunit of
the bacterial holoenzyme RNA polymerase are conserved,
but the −35 sequence (TTGACA) is substituted by the
sequence (ccATAAcCcCPuG/Cac). This sequence renders late
transcription C-dependent (167).

Mom Transcription 

The modification function expressed by genes com-mom
produces α-N-(9-β-D-2'-deoxyribofuranosylpurin-6-yl)glycinamide
(95, 233). The adenine residues to be modified
form part of the deduced consensus sequence c/gAc/gNPy
(119). The mom gene belongs to a dicistronic operon located
at the right extremity of the Mu genome whose two genes
(com and mom) partially overlap (for reviews see 97, 121).
The premature expression of mom, like that of lys, is harmful
for the host cell and therefore expression of these genes is
strictly controlled (98, 125). In the case of mom, the regulation
acts at both transcriptional and translational levels (97).
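
As a rough illustration of how the consensus c/gAc/gNPy can be read, the sketch below scans a sequence for matches and reports the position of the central adenine. It assumes the usual interpretation of the notation (c/g = C or G, N = any base, Py = C or T), searches only the strand supplied, and uses a made-up example sequence; the name `find_mom_sites` is hypothetical and the code is not taken from the cited work.

```python
import re

# Consensus c/gAc/gNPy written as a regular expression:
# [CG] A [CG] N(any base) Py(C or T). The lookahead lets overlapping
# matches be reported.
MOM_CONSENSUS = re.compile(r"(?=([CG]A[CG][ACGT][CT]))")


def find_mom_sites(seq):
    """Return 0-based positions of the adenine in each consensus match."""
    seq = seq.upper()
    # The adenine is the second base of the five-base motif.
    return [m.start() + 1 for m in MOM_CONSENSUS.finditer(seq)]


if __name__ == "__main__":
    example = "TTGACGTCCAGATAA"  # hypothetical sequence, for illustration only
    print(find_mom_sites(example))  # [3, 9]
```

A match to the consensus only marks a candidate adenine; whether a given site is actually modified is not something this pattern search can establish.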

The Pmom promoter region (figure 30-2) contains the
approximate location of two promoters: momP1 and momP2.
The promoter momP1 is relatively weak, with a consensus
sequence of −35 (ACCACA) and −10 (TAAGAT), separated
by 19 bp containing a run of six T nucleotides. RNA polymerase
does not bind the momP1 promoter (17); instead it
binds weakly to the diverging momP2 promoter, which overlaps
momP1 and promotes “leftward” transcription (71, 229).
The stretch of six A nucleotides complementary to the run
of six T nucleotides (above) appears to be part of a UP
element, a promoter region found upstream of a −35 region
that assists in the promotion of leftward transcription from
momP2 (230).
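
To make the weakness of momP1 concrete, the hexamers quoted above can be compared position by position with the σ70 consensus hexamers cited earlier in this chapter (TTGACA and TATAAT). This is only a counting sketch; the helper name `matches` is hypothetical, and the comparison says nothing about actual promoter strength beyond sequence identity.

```python
# σ70 consensus hexamers as quoted in the text, and the momP1 hexamers.
SIGMA70_MINUS35 = "TTGACA"
SIGMA70_MINUS10 = "TATAAT"
MOMP1_MINUS35 = "ACCACA"
MOMP1_MINUS10 = "TAAGAT"
MOMP1_SPACER_BP = 19  # spacing between the two hexamers, from the text


def matches(hexamer, consensus):
    """Count positions at which two equal-length hexamers carry the same base."""
    if len(hexamer) != len(consensus):
        raise ValueError("sequences must have the same length")
    return sum(a == b for a, b in zip(hexamer, consensus))


if __name__ == "__main__":
    print("-35 matches:", matches(MOMP1_MINUS35, SIGMA70_MINUS35))  # 3 of 6
    print("-10 matches:", matches(MOMP1_MINUS10, SIGMA70_MINUS10))  # 4 of 6
    print("spacer (bp):", MOMP1_SPACER_BP)
```

Only half of the −35 positions and four of the six −10 positions agree with the consensus, which is in keeping with the description of momP1 as a relatively weak promoter.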

At the transcriptional level, mom expression is activated
by the C protein, which binds the sites at −28 and −57 of
the momP1 promoter region (71, 211) and brings about an
asymmetric distortion and unwinding of the DNA (18, 19,
211). It has been suggested that the leftward transcription
of momP2 might prevent low-level rightward transcription
from momP1 in two possible ways. The first is that momP2
might compete with momP1 for RNA polymerase binding
in the absence of C protein; the second is that leftward
transcription produces an antisense transcript that might
prevent gin mRNA elongation into mom. The first hypothesis
was discarded when the destruction of the −10 hexamer of
momP2 did not lead to an increase in the basal-level activity
of momP1 (19). These facts have been interpreted
to mean that momP2 acts as a sink for capturing RNA
polymerase in the vicinity of momP1, so that RNA polymerase
is ready for occupancy at momP1 when the regulated
expression of C protein has accumulated sufficient protein
levels to turn on mom expression (17).
The Com protein acts as a positive regulator (119) at the
level of translation of the mom open reading frame (100,
273). Com appears to relieve a translational repression
caused by the presence of a TIS (translation-inhibitor
stem-loop) structure (101, 271, 274). Thus, footprinting studies
demonstrated that Com specifically bound to com-mom
mRNA at the putative TIS site destabilized the messenger
RNA secondary structure to expose the ribosome-binding
sequence and the Mom start codon (101, 274).

Negative and Positive Control of Transcription

Phage Mu can enter into a lysogenic cycle only when transcription
of the Mu early region is repressed by the c gene
product, Repc. By contrast, entry into the Mu lytic cycle
occurs when gene c expression is repressed and, as a consequence,
early genes are transcribed. After an initial early
burst of transcription 4 minutes after virion adsorption or
prophage induction, the transcription rate progressively
decreases due to expression of the ner gene, the first gene
of the early region. Early transcription then continues at a
low level during the lytic cycle (265). The Ner protein inhibits
the transcription from PcM, the promoter controlling gene
c expression (249). Mutant analysis showed that the Ner
binding site overlaps Pe and PcM (249) and footprint analysis
with the purified protein localized the Ner binding site
between sites Pe and PcM. This site shows a dyad symmetry
and binding of Ner to the left or right half may inhibit transcription
from Pe or PcM, respectively (79).

There is no evidence for the existence of a positive control
of Mu repressor synthesis. In contrast to bacteriophage λ,
there are no Mu genes equivalent to the λ cII and cIII genes
and all Mu clear-plaque mutants belong to the same complementation
group, mapping to gene c. The identification of
a particular phenotype in Mu cts lysogens, defective for
replication and unable to show recovery in immunity when
shifted from 42 to 32 °C (72), led to the suggestion of the existence
of a gene for immunity control, called cim. This behavior
can only be observed in a kil background, though it
is now thought that this immunity effect is not due to
the action of a separate gene, but instead reflects a subtle
interplay occurring between various elements that control
early gene expression from the early Mu region. Various
data also indicate that some host functions, such as IHF,
may stimulate early Mu transcription (80).

The Mu Repressor 

The 22 kDa Mu repressor (Repc) is a product of gene c and
consists of 196 amino acids. Its function is the repression of
Mu lytic growth and in doing so Repc binds to a total of 11
sites on the Mu chromosome: nine sites spread among three
enhancer-region operators (O1, O2, and O3 as presented in
figure 30-4) and the two promoters, Pe and PcM. The early
promoter Pe overlaps the O2 operator site and drives the early
transcription (through O2 and O3) of genes A and B, which
are necessary for lytic growth. The promoter PcM, on the
other hand, is co-localized with O3 and drives transcription
in the opposite direction of Pe, through O2 and O1 and into
gene c (for a review see 80). Cooperative binding of Repc to
both O1 and O2 is thought to inhibit transcription at the Pe
promoter and therefore to inhibit both early transcription
and lytic growth. Cooperative binding to all three operator
segments (O1, O2, and O3) at higher Repc concentrations
also shuts down the PcM promoter and therefore synthesis
of Repc (253). The repressor also inhibits transposition
directly by competing for Mu protein-A (transposase) binding
to the enhancer, which is located within the operator,
between O1 and O2, and is part of the LER complex required
for transposition (above) (4, 47, 156).

Two kinds of c mutants have been isolated: thermosensitive
mutants and virulent mutants. The Mu cts62 (109)
mutant, employed to create temperature-inducible Mu lysogens
or miniMu plasmids (as discussed above), carries the
substitution R47Q found in the N-terminal Repc DNA binding
domain. This mutation reduces repressor binding to DNA
at 30 °C and makes it very weak at 42 °C. Alternatively, a
virulent Mu mutant that has frameshift mutations altering
the last 11-26 C-terminal residues of Repc (68) can superinfect
Mu lysogens to disrupt Mu immunity and induce
lytic development (253). The alteration causes the mutant
repressor (Vir) to be highly sensitive to ClpXP protease
(discussed above). This Vir mutant confers a dominant negative
phenotype because it also promotes rapid degradation
of the wild-type Repc repressor, even though by itself
wild-type Repc is otherwise relatively resistant, in vivo, to ClpXP
proteolysis (67, 148, 177, 260).

Mu Ner 

The first gene transcribed from Pe is ner, whose protein negatively
regulates both c (249) and the transcription of the
early region (10, 266). Ner is a small (74 amino acid), basic
DNA binding protein and, on the basis of its function, can
be grouped with the proteins regulating the lytic-lysogenic
switch, such as λ Cro (for review see 177). Despite this fact,
there is no sequence homology between Mu Ner and Cro
proteins or other DNA binding proteins of other phages,
except the Ner protein expressed by the Mu-like phage
D108 and the E. coli Nlp protein (229). Constitutive ner
expression (due to an exogenous promoter carried on a
multicopy plasmid) shuts off early transcription by infecting
phage and also inhibits Mu replication (78, 249). This
phenotype, similar to Repc-mediated immunity, is called
pseudo-immunity.

The essential character of the Mu Ner protein is shown
by the behavior of ner− mutants, which greatly affect the
Mu lytic cycle but not Mu lysogeny, which remains normal.
The phenotype of the double mutants ner− c− helped
clarify the Ner mechanism of action (79). These
mutants not only do not lysogenize but also do not form
plaques unless they are complemented for Ner. This result
suggested that the lack of growth of ner− phage is not
due to an excess of repressor synthesis but instead could be
explained by assuming that the RNA polymerase, during c gene
transcription, proceeds through the protein-A transposase
binding sites found in the vicinity of gene c (figure 30-4),
thereby preventing transposase binding (79). In other words,
since Ner appears to be essential even in the presence of
an inactive repressor, a reasonable hypothesis is that
it is repressor gene transcription and not the repressor
protein itself that interferes with transposition.

Commitment to Lysogeny versus the Lytic Cycle

The integrated Mu DNA is committed to two alternative 
cycles: the lytic (or productive) cycle, which concerns most 
of the population, and the lysogenic cycle. It is not known 
what determines this choice, though a distinct possibility 
stems from the fact that the early transcripts from the Pe 
and PcM promoters overlap so that the transcription of 
one interferes with the transcription of the other. Thus, the 
choice between the lytic and lysogenic state is dependent, at 
least in part, on the balanced use of the two promoters. 
In addition, it has been hypothesized that the transposition
enhancer is a structural element critical for the lysis/
lysogeny choice. This structure, which includes the operator 
sites of the phage (figure 30-4), is recognized and bound 
by both the Repc repressor protein and the protein-A transposase,
which stems from their sharing sequence homology
in their N-terminal domains. Thus, depending on which of 
the two proteins is bound, the enhancer/operator can bring 
about two mutually opposite outcomes: turning off phage 
early gene expression or turning on transposition (201). 

Lysogenic State Derepression 

The lysogenic state is remarkably stable and spontaneous
prophage induction is very low. The lysogenic state seems to
be maintained by cooperative binding of the Repc repressor
to both O1 and O2, inhibiting transcription at the Pe promoter
and thereby preventing expression of the Mu A
and B genes needed to effect lytic growth. Nevertheless,
the repressor, as regulator of Mu transposition, must
allow immune-state derepression, presumably in response
to some change in host physiological state. Conditions and
signals that lead to prophage derepression, however, are
unknown. A series of data show that lysogens with a Mu cts
prophage can be derepressed during the stationary phase or
upon carbon starvation (S derepression). This is a process
that requires the ATP-dependent proteases ClpXP and Lon
as well as the stationary-phase-specific sigma factor, σS
(149, 212, 223).

Even if the primary signal that leads to derepression is not
known, lysogen immunity is influenced by the activity of
many host-derived factors besides the ClpXP protease (177,
221) including IHF (8, 64, 142, 222, 251, 252), FIS (20, 21),
H-NS (62, 194), DNA binding proteins (81), and the RpoS/σS
stationary phase sigma factor (76). From these observations
it emerges that Repc is a sensitive receptor of degradation
signals, apparently becoming susceptible to protease degradation
in response to one or more E. coli signal molecules.
Such a mechanism was initially suggested as the means by
which proteolysis can be triggered by the mutant form of
Repc called Vir, which can cause wild-type repressor to be
degraded at an accelerated rate by the E. coli ClpXP protease
(67, 148, 177, 260). In addition, derepression can be promoted
by repressor peptides that include the N-terminal DNA binding
domain (DBD) and the C-terminal ssrA tag of Repc (193).
The latter is an 11-residue degradation signal (AANDENYALAA)
added to incomplete peptides by ssrA RNA (tmRNA)
when ribosomes stall on mRNA (82, 134, 244). The modulation
of Repc degradation by the ClpXP protease may, therefore,
be a mechanism by which entry into lytic development
is regulated: Repc binds to the Mu operator sequences, O1,
O2, and O3, which regulate expression of the transposition
functions as well as the repressor gene (46, 142) (as discussed
above).

Mu Excision 

Despite its behavior as a transposable element, mutations 
due to the integration of Mu DNA are remarkably stable: 
reversion frequencies are less than 10⁻¹⁰ (117). This lack of
reversion of Mu-induced mutations is paradoxical when 
compared with the behavior of other transposable elements 
(both prokaryotic and eukaryotic), all of which are able to 
excise precisely. However, lysogens with the Mu wild-type 
prophage, submitted to particular conditions such as long 
periods without cell division or nutritional starvation, or 
lysogens carrying Mu prophage with a gem operon mutation, 
show relatively frequent excision. 

Mu DNA excision was first observed by Bukhari (24) in
bacterial strains lysogenic for a phage mutant (Mu cts62 dX)
that are able to express the transposase, after induction of
the lytic cycle by thermal shift, without either host cell
killing or prophage induction. Under these conditions, both
precise and imprecise excision events were obtained, all of
which were associated with prophage loss. Mu cts62 dX
excision required partial derepression of the Mu A gene,
coding for the transposase, and, unlike excision of other
prokaryotic transposons, was RecA-dependent (135). For
example, Tn10 excision is independent of RecA (65, 163) and
Tn5 excision is similar to spontaneous deletion events (55),
without any involvement of the transposase.

An interesting kind of Mu prophage excision was
observed with the Mu gem2ts mutants. Mutations induced
by the integration of a Mu gem2ts mutant prophage can
revert at frequencies around 1 x 10⁻⁶, more than 10⁴-fold
higher than that obtained with Mu wild-type. Several
aspects characterize Mu gem2ts precise excision (69): (i) the
phage transposase is not involved in the excision process but
is necessary for phage reintegration (P. Ghelardini, personal
communication); (ii) the RecA protein is not necessary; and
(iii) revertants remain lysogenic with the prophage inserted
elsewhere in the host genome. The site of reintegration
somehow depends on the original site of insertion: there is
a strong correlation between the original site of insertion
(the donor site) and the target of the phage DNA migration
(the receptor site) (197).
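
The fold difference quoted above follows directly from the two reversion
frequencies. The short Python sketch below only restates that arithmetic;
the constant and function names are illustrative and do not come from the
cited studies. Because the wild-type figure is an upper bound, the
computed ratio is a lower bound.

```python
# Reversion frequencies as quoted in the text.
WILD_TYPE_UPPER_BOUND = 1e-10  # Mu wild-type: "less than 10^-10"
GEM2TS_FREQUENCY = 1e-6        # Mu gem2ts: "around 1 x 10^-6"


def fold_increase(mutant: float, wild_type_bound: float) -> float:
    """Return the ratio mutant / wild_type_bound.

    When wild_type_bound is an upper bound on the wild-type
    frequency, the result is a lower bound on the fold increase.
    """
    return mutant / wild_type_bound


ratio = fold_increase(GEM2TS_FREQUENCY, WILD_TYPE_UPPER_BOUND)
print(f"gem2ts reverts at least {ratio:.0e}-fold more often than wild-type")
```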

The Invertible G-Segment 

Host Range Modification 

An aspect of Mu biology that has raised much interest is its
ability to extend its host range through the programmed
inversion of a segment of its genome (figure 30-7). The first
evidence of this inversion was the observation that Mu DNA
obtained after induction of a Mu cts lysogen, after melting
and reannealing, showed a non-pairing bubble in 50% of
the molecules due to the inversion of a DNA segment of
about 3000 bp (49, 51 and, for review, 139) called the
G-segment. This observation suggested that in half the
phage particles the G-segment had the opposite orientation
compared with the other half of the population. However,
the phage produced through the lytic cycle in E. coli K12
was in 98% of cases in an orientation conventionally
indicated as G(+) and only in the remaining cases G(-) (132).
Symonds and Coelho (234), with a single burst experiment,
suggested that only the G(+) particles are able to infect
E. coli K12 strains, whereas the G(-) particles cannot grow in
these strains. In addition, this experiment showed that
the inversion is a slow process, since many
generations are necessary to observe a G(+) cell starting
from a single cell lysogenic for a G(-) prophage.

The inversion occurs in the prophage and an equilibrium
between G(+) and G(-) cells is reached in a culture that
has increased in size (via lysogen division) to more than
1 x 10⁷ cells. Thus, a Mu lysogen population by and large is
a mixed population consisting of 50% G(+) and 50% G(-)
prophages. Only the G(+) particles of this population can
be propagated by infection in E. coli K12. The G(-) particles,
by contrast, can infect other bacterial species including
Citrobacter freundii (247). Various groups (139) have also
shown that in the G(-) configuration Mu can infect E. coli
C, Shigella sonnei, Enterobacter, and Erwinia. In the G(+)
state, besides E. coli K12, Mu can infect
Salmonella arizonae (127). The receptor of both G(+) and
G(-) particles is in the membrane lipopolysaccharides. No
differences were observed by electron microscopy in the tail
fibers, but antibodies obtained against Mu G(+) particles are
more active against these particles than against those with
the opposite orientation, and vice versa (88).
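
The approach to a 50:50 mixture can be illustrated with a minimal
two-state sketch in Python. It assumes, purely for illustration, that each
prophage flips orientation with the same fixed probability per generation
in either direction; the text gives no inversion rate, so `flip_rate` and
the values tried below are hypothetical. Under that assumption the G(+)
fraction after n generations is 0.5 - (0.5 - start) * (1 - 2 * flip_rate)**n,
which tends to 0.5 whatever the starting orientation.

```python
def g_plus_fraction(generations: int, flip_rate: float, start: float = 0.0) -> float:
    """Fraction of G(+) prophages after a number of generations.

    Assumption (not from the text): inversion is symmetric and memoryless,
    so every generation a prophage flips orientation with probability
    `flip_rate`, whichever orientation it currently has.
    """
    p = start
    for _ in range(generations):
        # G(+) cells that stay G(+) plus G(-) cells that flip to G(+).
        p = p * (1.0 - flip_rate) + (1.0 - p) * flip_rate
    return p


# One cell needs about 24 doublings to exceed 1e7 descendants (2**24 > 1e7).
DOUBLINGS = 24

# Start from a pure G(-) lysogen and try a few hypothetical flip rates.
for rate in (0.01, 0.05, 0.2):
    print(rate, round(g_plus_fraction(DOUBLINGS, rate), 3))
```

With a small hypothetical rate the population is still far from 0.5 after
24 doublings, which is one way to read the statement that inversion is a
slow process.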

The product of the gin gene, adjacent to the G-segment,
mediates the inversion (132). S⁻ and U⁻ mutants do not
adsorb on E. coli and lack tail fibers (88). These two genes,
coding for the proteins involved in adsorption, are located
in the left portion of the G-segment in the G(+) orientation.
In the G(-) orientation the S' and U' genes code for an
alternative set of tail fiber proteins.

Figure 30-7 G-region inversion. The alternative expression of the S U and S' U' genes, due to the programmed inversion of
the 3 kb DNA segment between the IR sites, allows the synthesis of an alternative set of tail fiber proteins. The S gene
of bacteriophage Mu is formed by an Sc sequence common to both S and S' genes, localized before the IR.L site, and
a variable portion, Sv or S'v, localized within the invertible G-region and followed by the U and U' genes, respectively.

The Inversion Mechanism

Various very sensitive biological assays were developed to
study the G-segment inversion mechanism (124, 206).
In one of these, the lac operon of Escherichia coli is inserted
in the G-segment fused with the 5' extremity of the Mu S'
gene (figure 30-8). Bacterial Δlac cells harboring the plasmid
in which this insert was cloned are not able to ferment
lactose. If the plasmid is incubated with bacterial extracts
overproducing Gin, the G-segment is inverted, the S' gene
is expressed, and the hybrid product S'-β-galactosidase is
produced. Bacterial cells transformed with this plasmid are
able to grow in minimal lactose medium. In another system,
the KnR gene lacking its own promoter is cloned in the G-segment
carried by a plasmid and, as predicted, inversion switches
the harboring bacterium from KnS to KnR.

Inversion, which is RecA-independent, was concluded
to be mediated by the 21.7 kDa Gin protein (132, 133, 205),
which catalyzes an intramolecular recombination between
the two 34 bp inverted repeats at the extremities of the
G-segment (205). Gin belongs to the invertases, a family of
proteins able to invert DNA segments, whose members are
highly homologous (more than 65% identical amino acids) and
therefore able to complement each other (129). On this basis
it is not surprising that their target sites are also homologous
among themselves. The length of the invertible region, on the
other hand, is not critical: deletions reducing the G-segment
from 3 kb to 132 bp do not alter the inversion frequency. By
contrast, the orientation of the target sites is very important:
if they are colinear instead of inverted, the efficiency of the
inversion reaction decreases by about two orders of magnitude.

Inversion of the G-segment occurs by Gin-mediated
recombination between 34 bp inverted repeats located at
the ends of the G-segment (111, 123, 131, 219). The inverted
repeats (IR.R and IR.L), which comprise the reaction target
sites, constitute two Gin-invertase binding sites. Each
inverted repeat is formed by two hemisites, which are not
equivalent to each other: site I can be bound even if present
alone, whereas site II can be bound only if site I is already
occupied by the protein, suggesting cooperative binding of
the invertase. In any case, the recombination efficiency is
maximal when both sites are present. It has furthermore
been observed that a DNA region on the right of IR.R is
essential for high-frequency G inversion (124). This region,
called sis, acts as an enhancer since neither its orientation
nor its distance from the IR sequences substantially modifies
the stimulation effect which, however, is observed only in
cis. The sis region is constituted by a 60 bp sequence formed
by a triple repetition of a consensus binding site for
the bacterial protein Fis (138). The function of Fis (140) is to
form a DNA-protein complex that helps to correctly assemble
the synaptic complex where the crossing-over reaction
occurs.


Additional Considerations 

Lysogenic Conversion 

Bacteriophages, as well as other genetic elements such as
plasmids and transposons, besides coding for functions
essential for their maintenance in the host cell, can carry genes
that modify seemingly unrelated aspects of the host cell
phenotype (lysogenic conversion: see chapter 27). In the
case of Mu bacteriophage, the lysogenic cell shows changes
in expression of a high number of bacterial genes due to
constitutive gemA expression. Genes affected include some
DNA replication and cell division determinants (70, 196).
This modification in host gene expression is accompanied
by both a reduction in the supercoiling of the bacterial
DNA and a reduction in bacterial gyrase activity. A model
proposed by Ghelardini et al. (70) links these observations by
hypothesizing that the Mu-induced relaxation of the bacterial
chromosome modifies the transcription of these genes,
since the expression of many bacterial genes is controlled by
supercoiling of their respective promoters (for review see 54).
Bacterial cell-cycle reprogramming, furthermore, is observed
upon infection with some gem mutants or in bacterial cells
where gemA is highly expressed.

Figure 30-8 Biological assay for the G-loop inversion. The E. coli lac operon, inserted in the G-loop fused with the 5' extremity of
the Mu S' gene, is inverted after incubation with a bacterial extract overproducing Gin. The inversion allows lac operon
expression and renders the bacterial cells able to grow in minimal lactose medium.

The importance of these facts in Mu biology is still
poorly understood. However, it is interesting to note that,
recently, genes homologous to gemA have been identified in
the Haemophilus influenzae Mu-like prophage FluMu (63),
in Neisseria meningitidis prophages Pnm1 (137) and Pnm2
(200), and in E. coli O157:H7 strain Sakai prophage Sp18 (13)
(these Mu-like phage strains are reviewed below). It has been
hypothesized that gemA could belong to the "morons" family
of transcriptional elements (118, 185), which consist of a
σ70 promoter, a coding region, and a factor-independent
terminator, all located between two genes whose homologs
are adjacent in a different phage (see chapter 27). These
elements could be independent modules which have entered
only recently into the phage genome (106).

Mu as a Genetic Tool 

Random integration into the host genome, induction of
strong polar mutations as a consequence of its integration,
and induction of various types of genome rearrangements
(inversions, deletions, duplications of the adjacent genes,
replicon fusions, and host DNA segment transposition) are
properties of Mu biology that, since their identification, have
been at the basis of the development of in vivo genetic
engineering. The development of these techniques started as
soon as it was possible to separate the host genome
rearrangements occurring during the Mu lytic cycle from the
lethal effects of the phage.
This was accomplished through the construction of a large
collection of clever Mu derivatives. These techniques were
applied to gene manipulation not only in E. coli, the bacterial
host where Mu was identified, but also in many
Enterobacteria, such as Shigella, Salmonella (S. typhimurium,
S. typhi, S. montevideo), Serratia marcescens, Citrobacter
freundii, Enterobacter sp., Klebsiella pneumoniae, and Erwinia
sp. Mu has also been found to be effective for genetic
manipulation in the Rhizobiaceae (Agrobacterium tumefaciens
and various species of the Rhizobium genus), in the
Pseudomonas genus, and in a
score of additional Gram-negative bacteria (for reviews,
see 243, 248).

Other Mu-Like Prophages 

For a long time, Mu was the only transposable bacteriophage
isolated, while λ- and P22-like phages were independently
isolated many times (52, 113). The inexplicability of this fact
raised the question of whether Mu is a vanishing breed
which Larry Taylor saved from extinction (128). In reality,
a heteroimmune Mu phage (D108) was identified in 1971
and, thereafter, two other transposable phages were isolated,
one infecting Pseudomonas and the other Vibrio cholerae.
Recently, completed genome sequences have permitted
identification of Mu-like prophages in Haemophilus, Neisseria,
and Deinococcus (185). A comparison between these various
Mu-like bacteriophages (105) is given below.


Coliphage D108 

Phage D108 is a transposable bacteriophage isolated by Mise
(178) that is closely related to Mu. The two phages share
90% homology as determined by DNA hybridization. Like
phage Mu, phage D108 has variable ends of bacterial
origin at its genome extremities and an invertible region
corresponding to the G-segment, and it generates a 5 bp
duplication in the integration target site. In addition, anti-Mu
antibodies cross-react with phage D108 virions and all the
genome rearrangements induced by Mu phage transposition are
also observed with phage D108.

The non-homologous regions between Mu and D108
correspond to the repressor and ner genes and to the 5'
end of the A gene. The consensus sequence for transposase
binding also shows some similarities. Functional hybrid
phages containing the Mu left end (genes c, ner, and the
5' end of A, the transposase gene) and the remaining D108
genome were constructed. These phages were called MD
phages, whereas those with the D108 left end and the
remaining Mu genome were called DM phages (242).

Pseudomonas Transposing Phages 

A number of transposable bacteriophages were isolated in
Pseudomonas strains by Krylov and colleagues (146). They
all show host sequences at their genome extremities, their
genome being about 37 kb in length. Among them,
D3112-like phages (146, 217) and B3-like phages (4, 5) constitute
the two main groups of P. aeruginosa transposable phages.
Phage D3112 shows a genome structure and a life cycle
closely related to those of Mu (12, 22, 185, 213, 217), whereas
the virion structure is morphologically more similar to that
of λ-like phages, and for this reason D3112 is classified as a
member of the λ-like Siphoviridae family (phages with long,
noncontractile tails) (see chapter 2 for a primer on phage
classification by virion morphology). Phage D3112, therefore,
resembles a kind of hybrid with a Mu-like genome
organization and a λ-like tail morphology (22). Upon infection,
the D3112 phage DNA integrates in the host genome by a
transposition mechanism and, in this state, the Repc protein
represses the lytic cycle and stabilizes integration (217). The
D3112 complete genome sequence (257) is 37,611 bp in
length and encodes 53 potential open reading frames,
including three known genes (the repressor c gene and early
genes A and B). Forty-eight percent of the open reading frames
were similar to Mu-like phage and prophage sequences,
including the proteins responsible for transposition,
transcriptional regulation, and the virion. However, phages
D3112 and Mu do not share sequence homology (213).

Vibrio cholerae Mu-like Phages

Little is known about VcA1 and VcA2 phages, isolated
from V. cholerae strain NIH41 (198). These two phages are
heteroimmune to each other and show only small
morphological differences (66). However, it has been observed that
VcA1 can integrate in various sites on the host genome and
that its properties are similar to those of the Mu phage (114).

Haemophilus influenzae prophage FluMu 

The H. influenzae Rd genome, between bases 1,559,722 and 
1,594,398, carries a Mu-like prophage called FluMu (63). 
This prophage, 34,676 bp in length, shows a relatively high 
G-C content: about 50%, compared with the 30% of the 
host genome. The FluMu genome has terminal 5'-TG-3'
dinucleotides flanked by direct repeats of five bases (ACGCA)
of the host DNA. Three potential transposase binding sites
were identified at each end of the phage DNA (185).
A comparison between the FluMu and Mu genomes reveals
a colinear genetic organization. In the early region, only
genes ner, A, B, gam, and gemA have significant matches
at the amino acid level. In contrast to Mu, FluMu appears
not to have an invertible region, and indeed a gene product
homologous to Mu invertase has not been identified in
FluMu.
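
The prophage length quoted above can be checked against the stated
coordinates, and a G-C percentage is simply the share of G and C bases in
a sequence. The Python sketch below is only an illustration: the constant
names and the `gc_content` helper are hypothetical, and the only sequence
it is applied to is the five-base repeat given in the text.

```python
# Coordinates of FluMu on the H. influenzae Rd genome, as given in the text.
FLUMU_START = 1_559_722
FLUMU_END = 1_594_398

# Difference of the two coordinates; this matches the 34,676 bp quoted above.
length_bp = FLUMU_END - FLUMU_START


def gc_content(sequence: str) -> float:
    """Return the fraction of G and C bases in a DNA sequence."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("empty sequence")
    return (seq.count("G") + seq.count("C")) / len(seq)


print(length_bp)
print(gc_content("ACGCA"))  # the five-base repeat flanking the prophage
```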

Neisseria meningitidis Mu-like Prophages 

Two genomes (type A and type B) of N. meningitidis have
been completely sequenced (199, 238). A prophage, called
Pnm1 (137), and three others (171), highly related to phage
Mu, were identified in type A N. meningitidis bacteria. The
Pnm1 G-C content is 53.1% versus the 51.8% of the host
bacterium, and the phage genetic organization shows many
characteristics in common with Mu-like elements.

Deinococcus radiodurans R1 Mu-like Prophage

D. radiodurans R1 (261) carries a Mu-like prophage called
RadMu whose most interesting aspect is the high phylogenetic
distance between Deinococcus and the proteobacterial
hosts of Mu, FluMu, and Pnm1 (269).

Mu-like Prophage Sp18 in E. coli O157:H7 Strain Sakai

Sequence analysis of the E. coli O157:H7 strain Sakai (103)
revealed the presence of a Mu-like prophage called Sp18
whose genetic organization is a kind of mosaic very similar
to the other Mu-like phages (185).

Mu and Mu-like Phage Evolution 

One of the characteristics of the genetic organization of
bacteriophages is the clustering of genes that belong together
functionally. This idea was expressed for the first time by
Campbell and Botstein (29) to explain the organization of
the λ genome, which can be subdivided into different
functional modules. Comparing the organization of
bacteriophage genomes is a way to group them into
families, to compare the representative of each
family with those of another family, and to establish their
phylogenetic relationships. Of course, other relationships can
be searched for, such as DNA, protein, or functional
homologies or similar genome-replication strategies. In the
case of Mu, the transposition module also constitutes the
replicative and the integrative module. Observing this
structure, one may question the evolutionary interrelationships
between phages and transposons. For example, comparison of the
Mu and λ prophage genetic maps shows a similar
and colinear organization of the genes (246):

Mu:  c    ner   A B   SEE region   C   lysis genes   structural genes
λ:   cI   cro   O P   b2 region    Q   lysis genes   structural genes

Comparison with transposon structure suggests that
the Mu replicative module resembles that of the
Tn3 transposon family. Mu has the same type of replicative
transposition as the Tn3 family, including a 5 bp duplication at
the insertion site. Also, the transposase and the repressor
genes are transcribed in opposite directions. According to Kamp
(128), Mu could have originated from the insertion of a Tn3-like
transposon into the genome of a Mu progenitor, due to the
mobilization of phage modules between two inverted copies
of the transposon. One transposon copy at the left of this
composite transposon could have retained most of its
functions, while the other could have lost its transposase and
had its TnpR resolvase changed into the Gin invertase.
Analogously, the origin of Tn3 from a Mu deletion, as
suggested by the miniMu behavior, cannot be excluded.

Perspectives 

Four decades of research on Mu bacteriophage are
characterized by a number of extraordinary discoveries crossing
over the various fields of basic biology. A retrospective
analysis of the history of this research shows that the
transposition mechanism has been the main object of study,
particularly in recent years, since this process can be finely
dissected in vitro. From a strictly virological point of view,
however, many aspects of Mu biology remain unresolved
and not adequately studied. Examples include the mechanism
leading to the integration of the infecting DNA, the
role and molecular nature of the products from the
semi-essential region, the choice between commitment to the
lysogenic versus productive cycles, the excision mechanism
and its biological significance, and lysogenic conversion.
Indeed, more attention needs to be paid to these aspects
since they could hide many fascinating features about this
virus, a virus that has chosen transposition as a way of life.

The Archaea (previously known as archaebacteria) were
originally identified as such by Carl Woese in the late
1970s (121) and are now accepted as one of the three major
divisions or domains of life, along with the Bacteria and the
Eukarya. They have recently been shown by molecular
techniques to be ubiquitous and may even be the dominant
organisms in the open ocean (22, 45). Phylogenetically the
Archaea have been split into four major phyla: the
Euryarchaeota, the Crenarchaeota, the Korarchaeota, and the
Nanoarchaeota. The Euryarchaeota comprise the methanogens
(6), the extreme halophiles, and the extremely
thermophilic Thermococcales (29). All isolated members of the
Crenarchaeota are extreme thermophiles; however, some as
yet uncultured members appear to be mesophilic or even
psychrophilic, for example in marine environments (28).
Korarchaeota are not yet represented by any cultured
organisms (8, 56, 87). The recently discovered phylum
Nanoarchaeota is represented to date by only one apparently
parasitic organism (38).

In comparison with viruses of Bacteria and Eukarya,
relatively little work has been done on viruses of Archaea.
Most characterized viruses of Euryarchaea have typical
"head and tail" morphology, similar to that of many
bacteriophages, and belong to the known families Myoviridae
and Siphoviridae. By contrast, all characterized viruses of
Crenarchaeota have unusual morphotypes, and novel viral
families had to be introduced for their classification.

As in the case of Bacteria, a number of fundamental
discoveries on the nature of Archaea have been made by the
study of their viruses. One was the identification of
transcriptional promoter sequences in the Sulfolobus virus SSV1
that resembled eukaryotic promoters (82) (reviewed in 125).
The discovery that the DNA packaged in SSV1 particles
is positively supercoiled was one of the first examples of
stably positively supercoiled DNA in nature (62). There are
undoubtedly many more discoveries to follow.

Recently a number of complete sequences of archaeal
viruses have been determined (5, 23, 48, 71, 73, 76, 110).
Their genome sequences have few similarities to other
known sequences in the sequence databases, although
some of these are to sequences of bacteriophage and animal
viruses (discussed below). There are striking similarities
between some archaeal viral genomes, possibly in part due
to horizontal gene transfer. Like the genomes of
bacteriophages, those of some viruses of the Archaea have a mosaic
structure and are subject to a great deal of modification by
recombination, deletion, and rearrangement (51). Comparison
of complete genome sequences of archaeal hosts
and viruses has allowed the identification of a number of
putative cryptic proviruses (55, 95).

Some of the viruses of Archaea have served as the basis
for the development of molecular genetics for the Archaea,
a major bottleneck for the study of these organisms (17,
102), and some were discovered for that purpose. One,
the virus ΨM1 of Methanothermobacter marburgensis, was
shown to be a general transducing virus (58) and there is
also a report of a general transducing particle for
Methanococcus voltae (10). Another, SSV1, has been modified to
be an infectious shuttle vector that replicates both in
Sulfolobus solfataricus and in E. coli (102).

Four of the viruses described in this chapter, the
halobacterial virus ΦH and the Thermoproteus viruses TTV1 to
TTV3, have been extensively reviewed in the previous
edition of this work (131) and little has been published on
them since. Therefore, data on these viruses will only be
briefly summarized here.

Viruses of Euryarchaeota 

The best studied of the viruses from the Euryarchaea are
the virus ΦH from Halobacterium salinarum studied by
Wolfram Zillig's group in the late 1980s (reviewed in 131), the
virus HF2 from Haloferax volcanii which has recently been
sequenced in its entirety by Michael Dyall-Smith's group
(110), and ΦCh1 from the haloalkaliphile Natrialba magadii
studied by Angela Witte's group and also recently sequenced
(48). The Methanothermobacter marburgensis virus ΨM1 and
its deletion variant, ΨM2, have been extensively studied by
Thomas Leisinger's group and were also recently sequenced
(76). Most of the viruses with euryarchaeal hosts have
typical head and tail morphology and have been assigned
to either the Myoviridae or the Siphoviridae virus families
(http://www.ncbi.nlm.nih.gov/ICTVdb/) (for an overview of
phage classification, see chapter 2).


Viruses of Halobacteria 

Extreme halophiles of the class Halobacteria (31) thrive in
conditions of near-saturating salt and are the dominant
organisms in many hypersaline environments (69). They
also include the extreme haloalkaliphiles, discussed in
more detail below. All characterized halophages, with the
exceptions of His1 (9) and the recently described His2 and
SH1 (23) (see below), have head and tail morphologies of
varying sizes. All have double-stranded linear DNA genomes
that vary from ca. 30 kbp (Hh3) to 230 kbp (Ja1), although
the latter was not measured directly (117). Pertinent
characteristics and references are listed in table 31-1 and
representative morphologies are shown in figure 31-1. A recent
review on novel haloarchaeal viruses is available (23).

ΦH (Myoviridae)

Halophage ΦH is probably the most studied virus of
extreme halophiles and has been extensively reviewed
(108, 131). It appeared "spontaneously", probably via
prophage induction, in a laboratory strain of H. halobium
(now H. salinarum) (93) in a manner very similar to that of
the unrelated ΦN (116). Halophage ΦH has a typical
myovirus structure with an icosahedral head and a
contractile tail with tail fibers (93) (figure 31-1). The ends of
the genome are terminally redundant, indicating that it
replicates via a headful packaging mechanism (93). It has a
highly variable genome due to recombination with its host
(75) and also duplication and inversion of the so-called L
region of the viral genome (90, 92). Halophage ΦH is
temperate and its lysogeny is similar to that of coliphage P1
(reviewed in chapter 24), as its genome is stably maintained
as a circular plasmid and does not integrate into the host
genome (91).

The complete genome sequence of the host of ΦH, H.
salinarum, confirmed the lack of an integrated prophage
(64). There are three putative homologs to the phage
repressor protein in the H. salinarum genome but no other genes
indicating an interaction with the genome, for example as
an integrated prophage, except the known IS elements (see
http://www.halolex.mpg.de/ for an overview of H. salinarum
and its sequence).

Repression of the prophage genome is complex. First,
it is regulated by a viral transcriptional repressor (109).
Second, translation is repressed by an antisense RNA-based
system in which an antisense RNA binds to the sense
T4 transcript and the 3' single-stranded region of the coding
transcript containing the ribosome binding site is removed
(106). Additional RNA processing occurs in ΦH replication
and is reviewed in Stolt and Zillig (108).


Table 31-1 Viruses of Halobacteria

Virus  Host                     Head/tail    Genome  Genome  Lytic/     Eclipse  Latent  Salt-      Burst    References
                                (nm)         size    G-C (%) temperate  period   period  sensitive  size
                                             (kbp)                      (h)      (h)
ΦH     Halobacterium salinarum  64/170c      59      64      T          5.5      7       +          170      (107, 131)
ΦN     H. salinarum             55/80n       56      70      N.D.       10       14      -          400      (116)
Hs1    H. salinarum             50/120c      N.D.    N.D.    L          12       17      +          200-300  (111, 112)
Ja1    H. salinarum             90/150c      230     N.D.    N.D.       2        6       +          140      (117)
Hh1    H. salinarum             60/100n      37.2    67      T          6        12      +          1100     (72)
Hh3    H. salinarum             75/50        29.4    62      T          5        8       +          425      (72)
S45    H. salinarum             40/70        N.D.    N.D.    L          N.D.     N.D.    +          1300     (20)
S5100  H. salinarum             65/76c       N.D.    N.D.    L          5.5      9       -          60-65    (19)
S50.2  H. salinarum             63/78c       N.D.    N.D.    L          5.5      9       -          60-65    (21)
S4100  H. salinarum             56/85c       N.D.    N.D.    L          5.5      9       -          60-65    (21)
S41    H. salinarum             89/141c      N.D.    N.D.    L          5.5      9       -          60-65    (21)
HF1    Halobacterium sp.        58/94        79.7    N.D.    L          N.D.     N.D.    +          N.D.     (23, 68)
HF2    Halorubrum coriense      58/94        77.670  55.9    L          N.D.     N.D.    +          65       (68, 110)
His1   Haloarcula hispanica     74 x 44/7 a  14.462  N.D.    L          4-6      3       -          <1       (9)
His2   Haloarcula hispanica     62 x 69/0    16.2    N.D.    L          N.D.     1       -          9        (23)
SH1    Haloarcula hispanica     55 b         27      N.D.    L          N.D.     N.D.    N.D.       N.D.     (23)
ΦCh1   Natrialba magadii        70/130c      54.498  61.9    T          5        11      +          150      (48, 120)

c, contractile tail; n, noncontractile tail; N.D., not determined or not reported.
a Dimensions of asymmetrical head.
b Spherical virus particle.

Figure 31-1 Viruses of Halobacteria. Transmission electron
micrographs of viruses from extreme halophiles. All negative
stain; bars represent 100 nm except 70 nm in panel C.
A: HF1, B: HF2, C: ΦCh1, D: ΦH, E: ΦN, F: His1. Panels A
and B reprinted from Nuttall and Dyall-Smith (68) with
permission. Panel C courtesy of A. Witte. Panel D reprinted
from Zillig et al. (127) with permission. Panel E reprinted
from Vogelsang-Wenke and Oesterhelt (116) with
permission. Panel F reprinted from Bath and Dyall-Smith (9)
with permission.

HF1 and HF2 (Unclassified; Siphoviridae) 

The haloviruses HF1 and HF2 were discovered as part of
a survey of Australian salterns for lytic viruses of
extreme halophiles with a wider host range than just
H. salinarum (68). They have typical head and tail
morphology (figure 31-1). As with all other known viruses of
extreme halophiles, they have double-stranded linear DNA
genomes (table 31-1). Unlike HF1, HF2 does not have
terminal redundancy of its genome but contains 306 bp direct
terminal repeats and must therefore replicate in a different
manner (67).

The complete genome of HF2 was recently sequenced
(110). Four tRNA genes were found in the virus genome,
possibly because the viral genome has a different nucleotide
distribution and codon usage compared with its host,
Halorubrum coriense. The G-C content for H. coriense has not
been determined; however, most Halobacteria have about
66% G-C in their genomes (31) compared with only 56%
for the HF2 genome. As with most genomes of viruses
of Archaea, few of the open reading frames in the HF2
genome (ca. 10%) matched any sequences in the
public databases. The significant matches to bacteriophage
genes include mycobacteriophage D29, Haemophilus phage
HP1, and the terminase of the enterobacterial phage RB49.
Matches to organismal genes include the Bacteria Aquifex,
Listeria, Sinorhizobium, and Synechocystis, and the archaeon
Thermoplasma; for details see Tang et al. (110). The HF2
genome also contains a DNA polymerase gene and a
number of helicase genes in addition to an integrase/recombinase
gene, even though the genome does not appear to
integrate into the host.
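
The G-C comparison above (about 66% for most Halobacteria versus 56% for HF2) comes down to a simple base count. The following is a minimal Python sketch, assuming plain DNA strings; the function name gc_content and the two toy sequences are illustrative only and are not taken from the HF2 or host genomes.

```python
def gc_content(seq: str) -> float:
    """Return the G+C percentage of a DNA sequence, ignoring non-ACGT symbols."""
    counted = [base for base in seq.upper() if base in "ACGT"]
    if not counted:
        raise ValueError("sequence contains no A, C, G or T")
    gc = sum(base in "GC" for base in counted)
    return 100.0 * gc / len(counted)


# Toy sequences invented for illustration (hypothetical, not real viral or host DNA).
virus_like = "ATGCGCATTAGCGGATCCAT"
host_like = "GCGCGGCCATGCGCGGCACG"

print(round(gc_content(virus_like), 1))
print(round(gc_content(host_like), 1))
```

A gap of this kind between virus and host is the sort of compositional difference that the text offers as a possible reason for the virus carrying its own tRNA genes.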

The HF1 virus was isolated from the same source as HF2.
The two viruses are morphologically identical (figure 31-1)
and have very similar genome size and protein composition.
However, the two viruses have different host ranges
(table 31-1), plaque morphologies, and sensitivity to chloroform
(68). The recently reported genome sequence of HF1
(23) was found to be a recombinant between HF2 and a
novel, related, but as yet undiscovered virus. About 60% of
the genomes of HF1 and HF2 are identical whereas the rest
of the sequence is only 87% identical (23).
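
As a rough worked example, the two figures above can be combined into a single genome-wide identity by weighting each region by its share of the genome. This sketch assumes that "the rest" means the remaining 40% of the genome and that both identity values are length-based; neither assumption is stated in the source, and the function name overall_identity is hypothetical.

```python
def overall_identity(segments):
    """Length-weighted mean identity over (genome_fraction, percent_identity) pairs."""
    total = sum(fraction for fraction, _ in segments)
    if abs(total - 1.0) > 1e-9:
        raise ValueError("genome fractions must sum to 1")
    return sum(fraction * identity for fraction, identity in segments)


# 60% of the genome identical, the remaining 40% at 87% identity (assumed split).
hf1_vs_hf2 = [(0.60, 100.0), (0.40, 87.0)]
print(round(overall_identity(hf1_vs_hf2), 1))
```

Under these assumptions the two genomes would be roughly 95% identical overall, even though one part of HF1 derives from a different parent.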

ΦCh1 (Myoviridae)

The virus ΦCh1 was found by spontaneous lysis of a culture
of the haloalkaliphilic archaeon Natrialba magadii, an isolate
from the soda lake Lake Magadi in Kenya. This organism
grows optimally at 3.5 M NaCl and pH 9.5. Unlike all other
known viruses of Archaea, it contains both host RNA and
viral DNA in its virion (120). The DNA genome, like those of
all other viruses of extreme halophiles, is double-stranded
and linear. The genome of ΦCh1 was found to integrate into
the host genome (120). Some of the DNA in the virus genome
was found to be methylated by a virus-encoded methylase
(7). Despite the low salt concentration of the E. coli cytoplasm,
this methylase was able to complement dam mutants
of E. coli (7). Surprisingly, the sequence of the main coat
protein of ΦCh1 was found to be similar to the sequence of
a coat protein from ΦH, even though the hosts of these two
viruses thrive in very different environments (49).

The recently determined complete genome sequence of
ΦCh1 was found to contain further sequences similar to the
partially sequenced ΦH genome (33, 48). In some regions the
nucleotide sequence identity was as high as 97%, indicating
potential horizontal gene transfer (48). However, ΦCh1 is
not able to infect H. salinarum (120). Eighty of the 98 open
reading frames (ORFs) in the ΦCh1 genome did not show
any similarity to genes of known function, similar to HF2
(see above) and the other viruses of Archaea whose
genomes have been sequenced (see below). There are two
putative integrase genes in the ΦCh1 genome, although it is
unclear which, if any, of them is used for the integration of
the provirus. Not only was the previously known methyltransferase
gene of ΦCh1 (7) identified but two more
putative methyltransferase genes were also found, including
one 57% identical to the ΦH cytosine methyltransferase
gene (105). One ORF showed highly significant similarity to
the PCNA protein, the processivity factor for archaeal
and eukaryotic replicative DNA polymerases (41). It will
be interesting to see whether it modifies the activity of
the N. magadii DNA polymerase or a viral polymerase that
was not identified. The genome of ΦCh1 otherwise is
organized in a manner resembling that of “early,” “middle”
and “late” genes of tailed bacteriophages (16), indicating
functional modularity.

Other Viruses of Halobacteria 

Six different viruses of H. salinarum (previously known
as H. cutirubrum) have been isolated from Jamaican salt
ponds: Ja1, S45, S5100, S50.2, S4100, and S41. They seem to
differ mostly in their mechanism of host attachment (21).
Their infectivity depends on the salt concentration of their
environment, like that of one of the first discovered viruses
of extreme halophiles, Hs1 (111, 112). The response to salt
concentration is probably important in the environment as
salt ponds are often diluted rapidly by rainfall such that the
obligately halophilic hosts do not survive. The H. salinarum
virus ΦN also is able to maintain 50% of its infectivity after
14 hours in distilled water (116). Its genome sequence differs
from that of ΦH, as indicated by Southern hybridization
(116). Its genome is fully cytosine methylated (116).

The only characterized viruses of extreme halophiles that
do not have a “head and tail” bacteriophage-like morphology
are His1 and His2 from salterns and salt lakes in Australia
(9) and the recently reported SH1 (23). Like ΦN, His1 is
resistant to low salt concentrations (9). His1 has similar
morphology and genome size to that of the well-studied
fusellovirus SSV1 of the thermoacidophilic crenarchaeon
Sulfolobus. However, it is lytic, not temperate, does not integrate
into the host genome, has a linear rather than circular
genome, replicates by terminal protein priming, and
shows no sequence similarity to SSV1 or other
members of the Fuselloviridae (23, 71). Therefore, His1
should not be considered a member of the Fuselloviridae (23)
and a new proposal for classification of this virus is pending
with the International Committee on the Taxonomy of
Viruses (ICTV) (M. Dyall-Smith, personal communication).
SH1 similarly is a lytic double-stranded DNA virus with a
linear genome; particles are spherical and possess a lipid
layer underneath an outer protein layer (23).

Virus-like Particles in Hypersaline Environments

Direct electron microscopic observation of samples from 
the Dead Sea after a bloom of halophilic Archaea revealed 
many virus-like particles with similar morphology to His1
and His2, as well as particles with an unusual star-shaped 
morphology (70). Attempts to cultivate the halobacterial 
host strains of these particles were unsuccessful (70). The 
dominant halobacterial species in these environments 
as determined by molecular techniques has also not been 
cultivated (35). Similar “orphan” spindle-shaped virus-like 
particles have been obtained from samples from solar 
salterns (34). 

Viruses of Methanogens 

There are considerably fewer characterized viruses of 
methanogens than of either Halobacteria or Crenarchaea 
(table 31-2). It is unclear whether this is due to the absence 


Table 31-2 Viruses of Methanogens

Virus   Host                              Head/tail     Genome size  Genome G-C   Lytic/     Eclipse     Latent      Burst  References
                                          (nm)          (kbp)        content (%)  temperate  period (h)  period (h)  size
ΨM1     Methanothermobacter marburgensis  55/210        30.4         N.D.         L          N.D.        4           8      (59)
ΨM2     M. marburgensis                   55/210        26.11        46.3         L          N.D.        N.D.        N.D.   (44)
ΨM100   M. wolfeii                        pp            28.79        45.4         pp         N.D.        N.D.        N.D.   (53)
ΦF1     M. thermoautotrophicus            70/160 x 20n  85           N.D.         L          N.D.        N.D.        N.D.   (66)
ΦF3     M. thermoautotrophicus            55/230 x 9n   36           N.D.         L          N.D.        N.D.        N.D.   (66)
PG      Methanobrevibacter smithii        N.D.          ca. 50       N.D.         L          N.D.        7-9         20     (11)
PMS11   Methanobrevibacter smithii        N.D.          35           N.D.         L          N.D.        N.D.        N.D.   (50)
VLP     Methanococcus voltae A3           52x70         23           N.D.         T          N.D.        N.D.        N.D.   (123)
VTA     M. voltae PS                      40/61         4.4          N.D.         N.D.       N.D.        N.D.        N.D.   (24)


pp, prophage; n, noncontractile; N.D., not determined or not reported.

of viruses or insufficient screening. The relative difficulty of
plating the methanogens as lawns (114) may have contributed
to this situation. There are, however, many plasmids of
methanogens which have been used to develop molecular
genetic tools (115), potentially lessening the interest of
screening for viruses. Two viruses of methanogens have
only been mentioned in meeting abstracts and it is unclear
why they were not further investigated (11, 50).

ΨM1, ΨM2 (Siphoviridae)

The best studied viruses of methanogens are ΨM1 and its
deletion variant ΨM2 that infect Methanothermobacter
marburgensis (formerly known as Methanobacterium
thermoautotrophicum; 118). The complete genome of ΨM2 has been
sequenced (76). ΨM1 is a lytic virus that was isolated from an
anaerobic sludge digester at a temperature of 55-60 °C (59).
The virus particle (figure 31-2) has an icosahedral head and
a flexible tail (59). In addition to its approximately 30.4 kbp
linear double-stranded DNA genome, virions contained
multimers of a cryptic plasmid from Methanothermobacter,
pME2000 (59), that has no similarity in sequence to ΨM1.
The packaged genome of ΨM1 was found to be circularly
permuted and to have an approximately 3 kbp terminal
redundancy at both ends (44). Both these facts led to the
proposition that the genome is packaged by a headful
mechanism. ΨM1 was shown also to transduce a number of
genetic markers (58), and is the only known transducing
virus for any of the Archaea.
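
To see why circular permutation together with terminal redundancy points to headful packaging, consider a toy model in which successive "headfuls", each slightly longer than one genome, are cut from a head-to-tail concatemer. The sketch below is illustrative only: the function name headful_packages, the 10-letter genome, and the 11-letter headful are assumptions chosen for readability and are not ΨM1 data.

```python
def headful_packages(genome: str, headful_len: int, n_particles: int):
    """Cut successive headfuls from a head-to-tail concatemer of `genome`."""
    g = len(genome)
    if headful_len <= g:
        raise ValueError("a headful must exceed one genome length to give terminal redundancy")
    packages = []
    start = 0
    for _ in range(n_particles):
        # Read headful_len symbols from the concatemer, wrapping around the unit genome.
        packages.append("".join(genome[(start + i) % g] for i in range(headful_len)))
        # The next headful begins where the previous one ended.
        start = (start + headful_len) % g
    return packages


toy_genome = "ABCDEFGHIJ"  # each letter stands in for a gene (hypothetical)
for dna in headful_packages(toy_genome, headful_len=11, n_particles=3):
    print(dna)
```

Each packaged molecule repeats its first letter at its end (terminal redundancy), and each successive particle starts one letter further along (circular permutation), which is the pattern described for the packaged ΨM1 genome.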


The only known host of ΨM1 is Methanothermobacter
marburgensis and the viral genome did not hybridize to the
genomic DNA of the uninfected host. There was significant
hybridization, however, to the genomic DNA of Methanobacterium
wolfeii (59), due to sequence similarity to its defective
prophage ΨM100. Resistance of M. wolfeii to infection
with ΨM1 is probably due to superinfection immunity also
provided by the defective prophage (53). The lysate produced
by ΨM1-infected M. marburgensis cells was, however,
still able to lyse both M. wolfeii and M. marburgensis by the
activity of a pseudomurein endoisopeptidase that is encoded
by the viral genome (76, 101).

The completely sequenced genome of ΨM2 was found to
contain 26,111 bp with a G-C content of 46.3%, slightly
lower than that of M. marburgensis (76, 113). The genome
does not encode any tRNA genes and the codon usage for
the virus is similar to that of its host (76). Of 31 ORFs greater
than 90 amino acids in length, only six were functionally
assigned. This percentage of identified ORFs is similar to
those of the viruses HF2 and ΦCh1 of the extreme halophiles.
Many ORFs appear to be cotranscribed and genes
with putative functions are clustered, again as in bacteriophage
and halovirus genomes (33, 48, 76). There were
ORFs similar to bacteriophage genes. Four of the ΨM2 ORFs
were similar to genes from Bacillus subtilis phage PBSX,
including a putative terminase gene and two structural
protein genes, one of which was shown to be a structural
protein of ΨM2 (76). There was also a putative portal gene
with sequence similarity to the portal gene of phage HP1 of



Figure 31-2 Viruses of methanogens. Transmission electron micrographs of viruses from methanogens. All negative
stain; bars represent 70 nm, except VLP and VTA 100 nm. A: ΨM1, B: ΦF1, C: ΦF3, D: VTA, E: VLP. Panel A from Meile
et al. (59) with permission. Panels B and C from Nölling et al. (66) with permission. Panel D from Eiserling et al. (24)
with permission. Panel E from Wood et al. (123) with permission.

Haemophilus influenzae (25). Finally, the putative tail protein
gene showed similarity to the tail protein gene from phage
L5 of Mycobacterium tuberculosis (36, 76). Surprisingly, as in
HF2 an integrase/recombinase gene of the phage λ family
was also found (for review of phage λ biology, see chapters
8, 9 and 27). It is intriguing that HF2 and ΨM2, which both
appear to be obligately lytic, contain putative
integrase genes, like the known integrative virus ΦCh1.
However, these gene products may be acting as recombinases
instead of integrases, like phage XerC in H. influenzae
(26, 76).

The deletion that took place in the generation of ΨM2
from ΨM1 was found to remove a 692 bp fragment between
two 82 bp direct repeats (76). This is similar to a deletion of
an 8 kbp fragment between direct repeats of 85 bp in the
conjugative plasmid pNOB8 of the thermophilic crenarchaeon
Sulfolobus (98). The mechanisms for these deletions
are unknown.

ΨM100 (Unclassified; Siphoviridae)

The defective prophage present in the M. wolfeii genome,
ΨM100 (104), was found to be 28,798 bp in length, slightly
larger than the ΨM2 genome (53). Almost all of the difference
is made up of an insertion of 2793 bp with anomalously
low G-C content (53). The ORFs in this insertion were similar
to ORFs from the Methanothermobacter thermoautotrophicus ΔH
genome (100). Otherwise the ΨM100 prophage genome has
71% overall nucleotide sequence identity to ΨM2 (53). The
putative packaging (pac) site and origin of replication for
ΨM2 were more than 98% identical to those of ΨM100.
However, this is also the region where the insertion in the
ΨM100 genome is present, so the replication defect in
ΨM100 may be due to the insertion. All but four ORFs
in the ΨM100 genome are from 40% to 100% identical in
predicted amino acid sequence to ΨM2 ORFs. Surprisingly,
the pseudomurein endoisopeptidase encoded by ΨM100 is
one of the least similar (54). There was one ORF that was
missing and one extra one in the ΨM100 genome. One ORF
was very similar in its N-terminus in the two genomes but
appeared to have insertions in the C-terminus in ΨM100.
Since the genomes of ΨM100 and ΨM2 are so similar it is
surprising that no virus particles have ever been observed
in M. wolfeii cultures (59, 104). Flanking the prophage
genome in the host are 21 bp direct repeats of all adenine
and thymine residues that are probably the attachment
sites (53). Whether the virally encoded integrase can act
on these sites awaits experimental confirmation.

Other Viruses of Methanogens 

Other viruses of methanogens have been considerably less
well characterized (see table 31-2). A DNase A-resistant DNA
was found in filtered cultures of Methanococcus voltae strain
PS that was also able to transduce a number of markers (10).
A small virus-like particle was observed in these cultures but
it was of very low titer and could not be induced by ultraviolet
irradiation or treatment with mitomycin C (24). This
voltae transfer agent (VTA) appeared to carry only about
4400 bp of apparently random DNA (10). The transducing
activity and the virus-like particles were also degraded
rapidly both aerobically and anaerobically (10).

Two lytic viruses of Methanothermobacter thermoautotrophicus
(previously known as Methanobacterium
thermoformicicum), ΦF1 and ΦF3, were discovered in a 55 °C
experimental sludge-bed reactor (66). Both had typical head
and tail morphologies with apparently noncontractile tails
(see figure 31-2). They both had double-stranded DNA
genomes; ΦF1 had a linear genome of about 85 kbp in size,
ΦF3 of about 36 kbp in size. The ΦF3 genome is either
linear with terminal redundancy or circular according to
the physical map of the genome (66). ΦF1 had a much wider
host range than ΦF3, including Methanothermobacter
thermoautotrophicus ΔH, one of the strains for which a complete
genome sequence is available (100). The host range of ΦF1
appeared to be related to the restriction/modification
system encoded by the host strains as there was a strong
selection against the CTAG sequence in the ΦF1 genome
(65). The genomes of ΦF1 and ΦF3 did not hybridize to each
other, to the host chromosomal DNA or to DNA from
the previously characterized methanogen virus, ΨM1 (see
above) (66).
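
Selection against a short word such as CTAG is often expressed as the ratio of its observed count to the count expected from base composition alone. The sketch below uses that zero-order model as an assumption; the source does not say how the analysis in (65) was carried out, and the function name observed_expected_ratio and the example strings are illustrative.

```python
from collections import Counter


def observed_expected_ratio(seq: str, word: str) -> float:
    """Observed count of `word` divided by the count expected from base frequencies."""
    seq, word = seq.upper(), word.upper()
    n, k = len(seq), len(word)
    if n < k:
        raise ValueError("sequence is shorter than the word")
    # Count occurrences at every position, so overlapping matches are included.
    observed = sum(seq.startswith(word, i) for i in range(n - k + 1))
    freqs = Counter(seq)
    # Expected count = number of windows x product of the single-base frequencies.
    expected = float(n - k + 1)
    for base in word:
        expected *= freqs[base] / n
    if expected == 0:
        raise ValueError("word contains a base that is absent from the sequence")
    return observed / expected


# Toy sequence with balanced base composition but no CTAG (hypothetical data).
print(observed_expected_ratio("ACGT" * 4, "CTAG"))
```

A ratio well below 1 across a whole genome would be consistent with counter-selection of the site, for example by a host restriction enzyme that recognizes it.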

A virus-like particle was found in cultures of Methanococcus
voltae strain A3 and isolated from unwashed cells (123).
The morphology was similar to that of SSV1 of Sulfolobus (see
below), although more heterogeneous (see figure 31-2). The
double-stranded circular DNA of this particle was 23 kbp,
considerably larger in size than that of SSV1. This DNA was
also found to be integrated into the host genome (123). The
particles appeared to have an envelope of one major protein
species, unlike SSV1 (123). No induction or infectivity of the
particles could be shown. No sequence data are available for
the DNA of these virus-like particles.

Viruses of Thermococcales 

The Thermococcales, including Pyrococcus, comprise some of
the most thermophilic organisms. Many different isolates
have been obtained, mostly from marine hydrothermal
vents, both shallow and from the deep sea (132). There has
been considerable study of enzymes from Pyrococcus species
for biotechnology applications (40). The genomes of three
Pyrococcus species (P. horikoshii, P. abyssi, and P. furiosus)
have been sequenced (see 135). However, the development of
genetic tools for these organisms has been relatively slow
(52). Virus-like particles have been observed in Pyrococcus
enrichment cultures from samples from deep-sea hydrothermal
vents (W. Zillig and I. Holz, personal communication;
P. Forterre and E. Marguet, personal communication;
30a). One head-and-tail virus from P. abyssi has been isolated
and its genome has been sequenced (30b). There have also
been reports of cryptic prophage in Pyrococcus genomes.
One 30 nm icosahedral virus-like particle of Pyrococcus
woesei was detected after spontaneous lysis of the strain on
entry to the stationary growth phase (128) but has not been
further studied.


Viruses of the Crenarchaeota 

The first viruses from Crenarchaeota were characterized
by Wolfram Zillig and coworkers. The cultivated viruses
infect extremely thermophilic members of the genera
Thermoproteus, Sulfolobus, and Acidianus. Many virus-like particles
have been observed in enrichment cultures from
samples from environments dominated by Crenarchaea.
Unlike viruses of the Euryarchaeota, none of these viruses
or virus-like particles have been shown to have head-and-tail
bacteriophage-like morphology. The unique morphology,
unusual genome structures, and genome sequences
required the introduction of three novel virus families,
Fuselloviridae, Lipothrixviridae, and Rudiviridae, by the ICTV.
One additional family, Guttaviridae, has been proposed (4).
Other novel viruses (see below) may necessitate the creation
of yet another family (table 31-3; figure 31-3).

Viruses of Thermoproteus tenax 

Thermoproteus tenax is a facultatively chemolithotrophic,
strictly anaerobic crenarchaeote originally found in a mud
hole in Iceland that grows optimally at 88 °C (133). It can
grow autotrophically by the reduction of elemental sulfur
to hydrogen sulfide or by sulfur respiration (133). The metabolism
of T. tenax has been well studied (99) and its genome
is being sequenced (B. Siebers and R. Hensel, personal
communication). Three different viruses of T. tenax (TTV1,
TTV2, and TTV3) were found in the Kra1 strain from the
Krafla volcano, Iceland on transfer from heterotrophic to
autotrophic growth conditions (42). A fourth, TTV4, was
found in a fresh sample from the Krafla volcano and lysed
all enrichment cultures (131). TTV1, TTV2, and TTV3 have
linear double-stranded DNA genomes, are unrelated according
to hybridization analysis, and have different protein
composition.

TTV1 (Lipothrixviridae)

The best studied of these viruses is TTV1 and it has been
extensively reviewed (131). The virions of TTV1 are flexible
rods (figure 31-3) with a central core enclosed in an envelope
that contains host lipids (131). The virion consists of at least
four proteins: two DNA binding proteins, one envelope
protein, and a fourth protein whose location is unknown.
About 80% of the genome has been sequenced (EMBL accession
X14855). The nature of its ends is presently unclear.

TTV1 persists in a “carrier state” in its host (42). Occasionally
some recombination with the host genome was observed
(131). The genome of TTV1 varies a great deal due to small 
insertions and deletions (70). The insertions do not appear 
to be IS elements as they are too small and do not have the 
inverted or direct repeats typical of these structures (63). 
The insertions have very different G-C content from the rest 
of the TTV1 genome, and it is unknown whether they are 
present in the host chromosome. 


Table 31-3 Viruses of the Crenarchaeota

Virus   Host                  Virion          Genome size  Genome G-C   Lytic/     Family              References
                              dimensions (nm) (kbp)        content (%)  temperate
TTV1    Thermoproteus tenax   40/400          15.9         37.0         T          Lipothrixviridae    (42)
TTV2    T. tenax              20/1250         16           N.D.         T          Lipothrixviridae    (42)
TTV3    T. tenax              30/2500         27           N.D.         N.D.       Lipothrixviridae    (42)
TTV4    T. tenax              30/500          17           N.D.         L          Unassigned          (131)
DAFV    Acidianus ambivalens  2200/27         56           N.D.         T          Lipothrixviridae    (129)
AFV     Acidianus sp.         900/24          20.1         N.D.         N.D.       “Lipothrixviridae”  (12)
SSV1    Sulfolobus shibatae   60x90           15.495       39.7         T          Fuselloviridae      (57)
SSV2    “S. islandicus”       55x80           14.794       38.5         T          Fuselloviridae      (2, 103)
SSV3    “S. islandicus”       55x80           15           N.D.         T          Fuselloviridae      (126)
SSVK1   Sulfolobus sp.        N.D.            17.779       38.8         T          Fuselloviridae      (88)
SSVY1   Sulfolobus sp.        N.D.            16.473       38.8         T          Fuselloviridae      (88)
SIRV1   “S. islandicus”       780/23          32.312       25.2         T          Rudiviridae         (77)
SIRV2   “S. islandicus”       900/23          35.502       25.3         T          Rudiviridae         (77)
SIFV    “S. islandicus”       1950/24         41.050       33.3         T          Lipothrixviridae    (5)
SNDV    S. neozealandicus     110-185/95-75   30           N.D.         T          “Guttaviridae”      (4)
STIV    Sulfolobus sp.        70              19           N.D.         T          Unassigned          (88)




Figure 31-3 Viruses of the Crenarchaeota. Transmission electron micrographs of viruses from crenarchaeal hosts. All negative
stain; scale bars as labeled in panels. A: TTV1, B: TTV2, C: TTV3, D: TTV4, E: SSV1, F: SSV2 plus pSSVx, G: STIV, H: SNDV,
I: SIRV2, J: DAFV, K: SIFV. Panels A to D reprinted from Zillig et al. (127) with permission. Panel E reprinted from Stedman
et al. (102) with permission. Panel F reprinted from Arnold et al. (2) with permission. Panel G, K.M.S. unpublished. Panels
H-K reprinted from Zillig et al. (126) with permission.

TTV2 and TTV3 (Lipothrixviridae)

TTV2 and TTV3 were not observed as frequently as TTV1
in cultures of the Kra1 strain and differ in their flexibility
and length from TTV1 and TTV4. All tested subclones of
T. tenax Kra1 produced TTV2 but production of the virus
was usually low and only occasionally strongly induced
(131). The large TTV3 has been observed even less frequently
than TTV2 and very little is known about it (131). The two
viruses are similar in morphology to the Sulfolobus virus
SIFV and to the Acidianus virus DAFV (see below).

TTV4 (Unclassified) 

The virions of TTV4 are stiff rods (figure 31-3). TTV4 
appears to be one of the most resistant viruses known, as 
it is stable and infectious after 1 hour of autoclaving at 
120 °C (3). TTV4 is also particularly virulent and spreads 
easily, complicating its study (131). 

Viruses of Acidianus 

Members of the archaeal genus Acidianus (Desulfurolobus)
are thermophilic acidophiles that can grow either heterotrophically
(A. brierleyi) or autotrophically (A. brierleyi,
A. ambivalens, and A. infernus), either aerobically by oxidizing
sulfur to sulfuric acid or anaerobically by reducing
sulfur to hydrogen sulfide (94, 134). They are commonly
found in autotrophic enrichment cultures from acidic hot
springs throughout the world. The genus groups with the
Sulfolobus species in the Sulfolobaceae family (39). In addition
to two viruses of Acidianus listed below, a number of virus-like
particles produced by different Acidianus strains have
recently been found in samples and enrichment cultures
from Yellowstone National Park in Wyoming, USA (YNP) (81).

DAFV (Lipothrixviridae)

A large filamentous virus about 2.2 μm long and 27 nm wide
was found in an Acidianus isolate from Iceland (figure 31-3)
(129). This virus, DAFV (Desulfurolobus ambivalens filamentous
virus), had very similar morphology to TTV2 and SIFV
(figure 31-3) and virus production could be induced by
ultraviolet irradiation. Surprisingly it appeared to be able
to infect a Sulfolobus strain in a coinfection with SIRV1
(129). Otherwise it resembled SIFV in both protein profile
and lipid content and showed some DNA cross-hybridization
(5). No sequence data are available for DAFV.

AFV (“Lipothrixviridae”)

AFV1 (Acidianus filamentous virus 1) was produced by
an Acidianus strain isolated from a hot spring in YNP (12).
Filamentous virions, 900 nm long and 24 nm wide, are
covered with a lipid envelope and contain at least five
different proteins with molecular masses from 23 to
130 kDa. With the help of unusual claw-like termini the
virions attach to pili of host cells. The host range is confined
to several Acidianus isolates from YNP. The genome is a
20.1 kbp long, linear double-stranded DNA that has been
sequenced. Of 40 ORFs longer than 48 amino acids, 12 are
similar to ORFs present in Rudiviruses and Lipothrixviruses
from Sulfolobus. One ORF is similar to an ORF present in
Fuselloviruses (12).

Viruses of Sulfolobus 

Many viruses of Sulfolobus have been studied, mainly
because of their relative ease of isolation (129). Sulfolobus
grows optimally at 80 °C and pH 3 and is an obligate
aerobe (32). It grows well on solid media both as single colonies
and in lawns (129). The complete genome of Sulfolobus
solfataricus strain P2 has recently been sequenced (97).
Viruses of Sulfolobus have also recently been reviewed (79,
126, 130). Several viruses of Sulfolobus have had their complete
or nearly complete genome sequenced (see below).

SSV1 (Fuselloviridae)

The first discovered and best studied of the viruses of Sulfolobus
is SSV1 (Sulfolobus shibatae virus 1) (reviewed in 3, 126).
SSV1 DNA was originally found as a plasmid in S. shibatae
from Beppu Onsen, Japan (124). Later it was shown to be
the genome of a UV-inducible virus-like particle (57). Finally,
by showing infectivity for S. solfataricus, SSV1 was shown to
be a virus (89). The virus particle has an unusual spindle
shape (figure 31-4) with a short tail at one end. The SSV1
genome is present in infected cells both as an episomal plasmid
and as an integrated provirus (124). In virus particles it
is packaged in a positively supercoiled state (62). The viral
genome integrates specifically into an arginyl-tRNA gene of
the host (85). This integration disrupts the viral integrase
gene via recombination in a 44 bp segment that duplicates
the 3' end of the tRNA gene. Therefore integration does
not disrupt the tRNA gene. In vitro the viral integrase
gene product is necessary and sufficient for both the integrative
and excisive reactions with oligonucleotide templates
(60, 61).
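
The logic of this integration event, in which the tRNA gene survives while the integrase gene is split, can be followed in a toy string model. Everything in the sketch below is hypothetical: the function name integrate, the 5-letter stand-in for the 44 bp shared segment, and the letter strings used for the host and viral genomes. It assumes a single crossover within the shared segment, as described in the text.

```python
def integrate(host: str, virus_circular: str, att: str) -> str:
    """Insert a circular genome into `host` by recombination within the shared `att`."""
    b = host.find(att)
    p = virus_circular.find(att)
    if b < 0 or p < 0:
        raise ValueError("att must be present in both genomes")
    # Linearize the circle so that it starts just after its own att copy.
    # (Assumes att does not span the arbitrary start/end of the string.)
    viral_linear = virus_circular[p + len(att):] + virus_circular[:p]
    # The provirus ends up flanked by two copies of att.
    return host[:b] + att + viral_linear + att + host[b + len(att):]


att = "TTCGA"                        # toy stand-in for the shared 44 bp segment
trna_gene = "gggcc" + att            # att duplicates the 3' end of the tRNA gene
integrase_gene = "INT" + att + "EGR" # att lies inside the integrase gene
host = "aaaa" + trna_gene + "tttt"
virus = "vvv" + integrase_gene + "www"

provirus = integrate(host, virus, att)
print(provirus)
print(trna_gene in provirus)         # is the tRNA gene still intact?
print(integrase_gene in provirus)    # is the integrase gene still contiguous?
```

Because the crossover falls inside the duplicated segment, the left junction rebuilds a complete tRNA gene, whereas the two halves of the integrase gene end up at opposite ends of the provirus.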

SSV1 has two highly hydrophobic coat proteins, VP1 and
VP3, the genes of which contain a directly repeated DNA
sequence (84). The virions appear to be packaged in infected
host cells in regions where the cellular membrane is
replaced by islands of viral coat proteins that, in contrast to
the normal cell membrane, have no contact with the S-layer.
Virus particles are formed by budding through these islands
(57). The virus particles themselves are very stable to high
temperature (97 °C) and low pH (2.0), but the packaged
DNA seems to be less so (3). Virus particles contain a very
basic protein, VP2, that binds DNA. The genome of SSV1
was the first of an archaeal virus to be completely sequenced
(71). Apart from the viral integrase gene, which belongs to
a large family of tyrosine recombinases including proteins
from the viruses HF2, ΦCh1, and ΨM2, none of the ORFs
matched any known proteins (71). ORFs containing cysteine
codons were present in only about one contiguous half of the
viral genome that contains two transcription units, leading
to the proposition that the virus is the product of a fusion of
two modules (71).

Analysis of the transcriptional promoter sequences led to
the discovery that promoters of Archaea are similar to
eukaryotic promoters (86). Replication is initiated by induction
of a short transcript, Tind, followed by induction of T5
and T6 transcripts (1 hour post-infection) and by proliferation
of the virus (about 4 hours post-infection). However,
the other transcripts in the virus genome are constitutively
expressed (86). Some of these viral promoters have been used
in in vitro tests for promoter function and the T6 promoter
was shown to be the strongest in vitro promoter known (80).

Progress has been made in disrupting a number of genes
in the SSV1 genome by a serial-selection technique (102).
This technique also allowed the construction of a viral shuttle
vector for Sulfolobus and E. coli that is very promising
for future research. It has been used to complement both
auxotrophic mutants and a metabolic mutant (43). It has
even been used to functionally express a protein from
the euryarchaeon Pyrococcus furiosus (K.M.S. unpublished
data). The putative origin of replication of SSV1 has
also been used as the basis for other shuttle vectors for
S. solfataricus and E. coli (17).

SSV2 (Fuselloviridae)

A virus named SSV2 with similar morphology and genome
size to SSV1 was found in a Sulfolobus isolate from Reykjanes
in Iceland (103). Its host was also found to produce a smaller
virus-like particle that contained a cryptic plasmid, pSSVx, of
the pRN family of plasmids of Sulfolobus (2, 46, 47). Two ORFs
of this plasmid are homologous to ORFs present in SSV1 and
SSV2 and are presumably involved in packaging of this plasmid
into the small virus-like particles (2). The SSV2 genome
is 55% identical to the SSV1 genome at the nucleotide level.
Nevertheless, most of the ORFs were arranged in the same
order in both genomes, clearly indicating that the viruses
belong to the same family. The ORF identity ranged from
unrecognizable to 76% (103). The integrase gene of SSV2 is
conserved, as are genes encoding the viral coat proteins VP1
and VP3. However, a gene for the VP2 protein is not present in the
SSV2 genome. The integrase gene of SSV2 carries a part of
a glycyl-tRNA gene rather than of the arginyl-tRNA gene
(103), and accordingly the SSV2 genome is integrated into
a glycyl-tRNA gene when S. solfataricus is infected with SSV2
(Q. She, personal communication).

ORFs that were previously shown to be important
for virus functions of SSV1 (102) were well conserved,
whereas those that were less important were less conserved.
Otherwise the genome is mosaic relative to SSV1, with a
number of ORFs present in one genome and lacking in the
other (103). By contrast to SSV1, SSV2 is induced when its
host reaches the stationary phase (1). Like SSV1, SSV2 has
the cysteine codons clustered in one half of the genome,
adding further support to the hypothesis that SSVs were
formed by module fusion similar to some bacteriophages
(37, 71).

Other Fuselloviridae 

Viruses similar in morphology and genome size to SSV1 and 
SSV2 were found in about 8% of samples taken from Icelandic 
solfataric fields (126). The genomes of several of these 
have been sequenced: SSV-Ic4 and SSV-Ic5 (X. Peng and 
R. A. Garrett, unpublished data). Four other SSVs have been 
found in YNP (SSV-Y1, SSV-Y2, and SSV-Y3) and Kamchatka, 
Russia (SSV-K1) (88). 

The SSV-K1 genome, with 17,384 bp, is the largest SSV 
genome known so far (119). The genome organization is 
similar to those of SSV1 and SSV2, with the exception of a 
large insert downstream of the viral integrase gene. The 
viral genome integrates in another host tRNA gene, an 
aspartyl-tRNA gene. Like SSV2 it is also missing a VP2 
gene, and like SSV1 it has been made into a shuttle vector 
for Sulfolobus and E. coli (K.M.S., unpublished data). The 
shape of this virus is more elongated than that of other SSVs. 

A homolog of the VP2 gene is also missing from the 
genome of SSV-Y1, which was found to integrate into yet 
another tRNA gene of the host, a lysyl-tRNA gene (119). The 
SSV-Y1 genome is also mosaic in that some ORFs are well 
conserved with other SSVs and others are not. 

In comparing sequences from all four genomes, there is 
no correlation between genetic and geographic divergence 
(119). Similar lack of correlation was obtained with partial 
sequences from three other SSVs from YNP (88). Whether 
different SSVs can simultaneously infect Sulfolobus or 
whether there is immunity to superinfection is unknown, 
although the integration sites do not overlap. 

SIFV ( Lipothrixviridae ) 

Sulfolobus islandicus filamentous virus (SIFV) was originally 
isolated from an Icelandic solfatara in a screen for extrachromosomal 
elements of Sulfolobus (129). SIFV has a filamentous 
structure. The filament is almost twice as long as the 
diameter of its host cell, Sulfolobus islandicus strain HVE10/4 
(figure 31-3). It has an envelope that contains host lipids but 
in a different proportion to the host (5), possibly reflecting 
specific envelope formation. Inside the 4 nm lipid envelope 
is a core made up of the DNA genome and the two major 
virion proteins (5). From electron tomography and image 
reconstruction, a model of the core was derived with the 
proteins on the inside and the DNA wrapped around 
the outside, reminiscent of chromatin (5). SIFV contains 
spider-like tail fibers in a mop-like arrangement at its 
termini (5). Its host range is limited to a few closely related 
S. islandicus strains (5). 

The SIFV genome does not integrate into the host genome 
but is present in a carrier state and is easily lost after culture 
dilution (5). It has been sequenced with the exception of the 
ends, the nature of which is unclear (5). The 40,266 bp 
sequenced contained 74 nonoverlapping ORFs. Similar to 
all other viruses of Archaea, only a few of these matched 
known sequences. There were also no matches with the 
partially sequenced genome of another lipothrixvirus, 
TTV1, making the assignment to the same virus family questionable. 
However, seven ORFs had homologs in the genome 
of the lipothrixvirus AFV1 of Acidianus, and 14 in the 
genomes of the rudiviruses SIRV1 and SIRV2 of Sulfolobus. 
One ORF, a291, had limited similarity to an ORF from SSV1 
shown by disruption to be critical for virus function (102) 
and is conserved in SSV2 (103) and other SSVs (119). There 
are three ORFs that might encode glycosyltransferases 
involved in the production of the virus envelope (5). 
There were no matches to bacteriophage genomes. Virus-like 
particles similar to SIFV but of different lengths have 
been observed in enrichment cultures from hydrothermal 
samples from YNP (88). 

SIRV1 and SIRV2 ( Rudiviridae ) 

The two known species of rudiviruses, SIRV1 and SIRV2 
(Sulfolobus islandicus rod-shaped virus), are similar in structure 
and have very similar genomes. Both are maintained 
without change in their original hosts, but differ strikingly 
in infection of other hosts (77). While SIRV2 multiplies without 
change in several hosts, the genome of SIRV1 is subject 
to striking sequence variation. Genome variation is mainly 
caused by accumulation of point mutations, with a rate of 
about 10^-3 substitutions per nucleotide per replication 
cycle, unprecedented for DNA viruses (77). This eventually 
leads to the selection of conditionally stable virus variants, 
coinciding with the recovery of high-fidelity replication. 
Such stable variants of SIRV1 produce further variants 
when infecting a new host, demonstrating that viral 
genome stability in a host does not exclude the potential to 
vary in other hosts (77). Mechanisms and controls underlying 
regulation of replication fidelity are as yet unclear. 
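
To give a sense of scale for such a substitution rate, the following back-of-the-envelope sketch multiplies an assumed rate on the order of 10^-3 substitutions per nucleotide per replication cycle by the 32.3 kbp SIRV1 genome length quoted in this section. The function names and the assumption that sites mutate independently are illustrative only and are not taken from reference 77.

```python
# Rough arithmetic only. The rate is an order-of-magnitude assumption and
# sites are treated as mutating independently, which is a simplification.
RATE_PER_NT_PER_CYCLE = 1e-3      # assumed substitutions per nucleotide per cycle
SIRV1_GENOME_LENGTH_BP = 32_300   # 32.3 kbp, as quoted in the text


def expected_substitutions(genome_length_bp, rate=RATE_PER_NT_PER_CYCLE):
    """Expected number of substitutions per genome copy per replication cycle."""
    return genome_length_bp * rate


def prob_error_free_copy(genome_length_bp, rate=RATE_PER_NT_PER_CYCLE):
    """Probability that a genome copy carries no substitution at all."""
    return (1.0 - rate) ** genome_length_bp


if __name__ == "__main__":
    print(expected_substitutions(SIRV1_GENOME_LENGTH_BP))
    print(prob_error_free_copy(SIRV1_GENOME_LENGTH_BP))
```

Under these assumptions each new genome copy would carry roughly 30 substitutions, and an error-free copy would be vanishingly rare, which is consistent with the description of the rate as unprecedented for DNA viruses.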

The genomes of SIRV1 and SIRV2 have been completely 
sequenced (73). They are linear, covalently closed at their 
ends, 32.3 and 35.5 kbp long, and contain 2.0 and 1.6 kbp 
long inverted terminal repeats, respectively. The genomes 
contain 45 and 54 ORFs, respectively, of which 44 are 
homologous to each other. Experimentally demonstrated functions include a 
dUTPase (functionally expressed in E. coli; 78) and a Holliday 
junction resolvase (functionally expressed in E. coli; 13). 
Predicted functions include two glycosyltransferases (73), 
a methyltransferase, a helicase, and a RecB family exonuclease 
(E. Koonin, personal communication). 


Comparison of nucleotide sequences of the two species of 
rudiviruses indicates that recombination, gene duplication, 
horizontal gene transfer, and substitution of viral genes 
by homologous host genes have contributed to their 
evolution (73). About 15% of the ORFs of both rudiviruses 
have homologs in the host chromosome, about 30% 
have homologs in the genome of the lipothrixvirus 
SIFV, and about 20% in the genome of the lipothrixvirus 
AFV1, suggesting that the two virus families form a 
superfamily (73). One ORF each of the rudiviruses has a 
homolog in the fusellovirus SSV1. No significant sequence 
similarities, however, were found in genomes of viruses 
from euryarchaeotes. 

The rudiviruses share characteristics of the linear organization 
of their double-stranded DNA genomes, namely 
covalently closed ends, long inverted terminal repeats 
(ITRs), and the presence in terminal regions of direct repeats 
and hot spots for recombination, with large eukaryal 
double-stranded DNA viruses including poxviruses, African 
swine fever virus, and Chlorella viruses (14, 73). Similarities 
between these viruses also include modes of replication and 
resolution of replicative intermediates. Head-to-head and 
tail-to-tail linked replicative intermediates of the SIRV1 
genome were found, indicating that replication follows the 
model suggested for replication of similarly organized 
genomes of eukaryal viruses. According to this model, replication 
is initiated by the introduction of a terminal nick 
followed by unpairing of one strand and elongation of the 
3'-OH terminus of the other. On reaching the end of the 
template, the elongated strand folds back on itself and 
copies the remainder of the genome. In both virus families 
concatemeric replicative intermediates apparently are 
resolved by virus-encoded Holliday junction resolvases (14). 
Moreover, recognition sequences for viral resolvases in the 
genomes of poxviruses and rudiviruses appear to be similar 
(13). All these similarities might indicate evolutionary relationships 
between genomes of rudiviruses and some large 
eukaryal double-stranded DNA viruses. 

SNDV (“ Guttaviridae ”) 

A novel virus, SNDV (Sulfolobus neozealandicus droplet-shaped 
virus), was found in a sample from the Steaming 
Hill solfataric field in New Zealand. Like other viruses from 
Crenarchaea it has an unusual morphology, in this case of a 
“bearded” droplet 110-185 nm long by 75-90 nm wide with a 
bundle of many thin fibers on the pointed half of the virion 
(4; table 31-3, figure 31-3). Despite its droplet form, the virion 
appeared to have helical symmetry, somewhat like the 
nucleocapsid of the human lentivirus HIV-1 (30). A novel 
virus family, “Guttaviridae”, has been proposed for classification 
of the virus. The host range of SNDV is confined to 
several Sulfolobus isolates from New Zealand. Host cells 
could be cured by repeated dilution, indicating that the 
virus is present in a carrier state (4). It contains a 20 kbp 
circular genome that is heavily modified and can be 
cleaved by the methylation-dependent restriction endonuclease 
DpnI, but not by the methylation-sensitive restriction 
endonuclease MboI (4). Therefore it appears that 
the viral genome, like that of the Natrialba magadii virus ΦCh1, 
encodes a dam-like methylase. There are no sequence 
data available for SNDV. The formation of SNDV is induced 
in the late stationary phase and a plaque test has not been 
established. 

Figure 31-4 Electron micrographs of some viruses and virus-like particles from hot springs in Yellowstone National 
Park, USA. Bars represent 200 nm (100 nm in insets). Reproduced from Rachel et al. (81), with permission. 


“STIV” (Unclassified) 

Another novel virus-like particle was produced by a Sulfolobus 
isolate from YNP (88). Unlike all other known viruses of 
Crenarchaea, it has an icosahedral symmetry with unusual 
projections at its 5-fold axes of symmetry (figure 31-3). A 
1 nm resolution structure for this virus has been determined 
by cryo-electron microscopy (88b). The 72 nm diameter 
icosahedron has T=31 quasi-symmetry (18). The genome is 
double-stranded DNA about 19 kbp in size and it has been 
completely sequenced. None of the ORFs analyzed to date 
have any known homologs (G. Rice and M. Young, personal 
communication). The virus infects S. solfataricus strain P2 
but its complete host range has not been determined. The 
proposed name is STIV ( Sulfolobus turreted icosahedral 
virus). Similar icosahedral virus-like particles have been 
found in Sulfolobus enrichment cultures of samples from 
Lassen Volcanic National Park, California, USA (R. Diessner 
and K.M.S., unpublished data). 
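
The T=31 value quoted above can be checked against the Caspar-Klug relation T = h^2 + hk + k^2 (reference 18), where h and k are non-negative integer lattice indices. The short sketch below enumerates the index pairs that give a requested T and reports the 60T subunit positions of the idealized scheme. The function names are illustrative, and the 60T figure is the textbook count for a quasi-equivalent shell, not a statement about the actual protein composition of STIV.

```python
def triangulation_number(h, k):
    """Caspar-Klug triangulation number for lattice indices h and k."""
    return h * h + h * k + k * k


def lattice_indices(t, limit=20):
    """Return all (h, k) pairs with h >= k >= 0 whose triangulation number is t."""
    return [
        (h, k)
        for h in range(limit + 1)
        for k in range(h + 1)
        if triangulation_number(h, k) == t
    ]


def idealized_subunit_positions(t):
    """Number of quasi-equivalent subunit positions in an idealized T shell."""
    return 60 * t


if __name__ == "__main__":
    print(lattice_indices(31))
    print(idealized_subunit_positions(31))
```

For T=31 the only solution with h >= k is h=5, k=1, which corresponds to a skewed (handed) lattice rather than one of the symmetric classes with k=0 or h=k.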


Uncultivated Viruses of Crenarchaeota 

Virus-like particles were detected in enrichment cultures 
from water samples from hot springs in YNP (81, 88). Some 
of these were shown to be produced by isolates belonging to 
the hyperthermophilic archaeal genus Acidianus (81). Some 
particles were morphologically similar to fuselloviruses, rudiviruses, 
and lipothrixviruses. Icosahedral particles similar 
to STIV were also observed. Additionally, particles with 
novel morphotypes were observed. The latter included 
400 nm oval particles with long (up to 200 nm) projections 
at either end (figure 31-4G, H), small (32 nm) icosahedral 
particles, zipper-like particles built up of triangular subunits 
(figure 31-4D), and pleiomorphic particles with arrowlike 
heads and apparently helical tails (figure 31-4E, F). Rod-shaped 
helical particles with a central cavity (figure 31-4A) 
presumably belonged to Rudiviridae. Some types of filamentous 
particles, presumably belonging to the Lipothrixviridae, 
had unusual bulbous ends (figure 31-4B) or rounded tips 
(figure 31-4C) (81). Different virus-like particles were 
observed in PEG 6000 precipitates of cell-free filtrates of 
environmental samples from various Icelandic solfataric 
fields, some with spindle shapes, resembling SSVs, and 
others arrow-shaped, rod-shaped, and filamentous (129). In 
high-temperature neutral samples from YNP a number of 
virus-like particles have been observed which were often 
attached to Thermofilum- and Thermoproteus-like organisms 
(W. Z., unpublished data). 

Proviruses in Sequenced Archaeal Genomes 

In addition to the almost complete copy of ΨM100 found in 
Methanothermobacter wolfeii as discussed above, a number 
of partial virus genomes have been found in the genomes 
of sequenced Archaea. The first to be found was a defective 
copy of the SSV1 genome in the genome of S. shibatae 
B12 (83), the strain from which SSV1 was originally isolated. 
By searching for disrupted integrase genes inserted into 
tRNA genes in a manner similar to SSV1, putative proviruses 
were found in the Methanococcus jannaschii and Pyrococcus 
horikoshii genomes but not in those of P. abyssi, P. furiosus, 
Archaeoglobus fulgidus, or Methanothermobacter thermoautotrophicum 
(55). In the S. solfataricus P2 genome two 
putative integrated plasmids were found that contained 
disrupted integrase genes and genes from cryptic or conjugative 
plasmids (74). More potential integrated elements 
were found in the Aeropyrum pernix, Archaeoglobus fulgidus, 
Halobacterium salinarum, S. solfataricus, S. acidocaldarius, 
Thermoplasma acidophilum, and Thermoplasma volcanium 
genomes (96; reviewed in 95). The prevalence of these 
elements was proposed as a major means of horizontal gene 
transfer (96). None of these integrative elements, however, 
contained a complete prophage genome. Presumably the 
integrative genetic elements will only be mobile in the 
presence of an uninterrupted viral integrase. A partial integrated 
SSV genome, lacking an integrase gene, was found 
in the S. solfataricus genome by comparison of the SSV2 
sequence to the genome (103). Apparently it has been 
trapped by the action of IS elements that are known to be 
very active in Sulfolobus (97). 

Some Thoughts on the Origins and the Evolution of Viruses 

Except within virus families or superfamilies (e.g., Rudiviridae 
+ Lipothrixviridae) and in special cases of sharing mobile 
genes and integrated proviruses, different viruses do not 
show sequence homology to each other or to other organisms. 
The widespread hypothesis that viruses are derived 
from cells is therefore not substantiated by close relationships 
of viral and organismal genes. Although this might in 
part be due to the higher rate of viral evolution, even in cases 
of significant homology of viral and organismal genes the 
viral homologs in phylogenetic dendrograms are often the 
lowest branches. This is particularly obvious for DNA metabolizing 
enzymes (27). Some viral proteins, such as the 
bacteriophage T3 and T7 DNA-dependent RNA polymerases 
and viral coat proteins, represent special solutions of evolutionary 
problems without homology in organisms. This deep 
branching of genes or complete lack of homology is to be 
expected when viruses or viral modules are not derived 
from organisms but are rather of prebiotic origins. The host 
for these viruses was the primordial soup or whatever place 
life first evolved. This is not as improbable as usually 
assumed, because the assembly of the first ancestor of all 
organisms also required the prior existence of all essential 
components in the prebiotic environment. Viruses or viral 
modules could then have developed from prebiotic macromolecules 
that were not integrated into the first common 
organismal ancestor but switched to organismal hosts 
when the “nutrients” of the primordial soup had been 
consumed by the expansion of organismal life. 

Whereas all known organisms are clearly derived from 
one common ancestor, the conspicuous differences in the 
features of different virus families along with the lack of 
indications of close relationships between them suggest 
that viruses might be polyphyletic, well in accord with 
the assumption of prebiotic origins. This would imply their 
ancient ancestry. 

A strong indication of an early origin of the archaeal and 
the bacterial Myoviridae is the close relationship of the halobacterial 
virus ΦH to E. coli phage P1, also belonging to the 
Myoviridae, suggesting a common origin rather than convergent 
evolution (91). It is improbable that the ancestors of the 
archaeal haloviruses or of the bacterial phages could have 
surmounted the strong and complex host-range barriers 
between their archaeal and bacterial hosts, respectively. 
More probably ancestral myoviruses were already infectious 
for ancestral organisms before the separation of the archaeal 
and the bacterial lineages and their progeny coevolved with 
the separating lineages of Archaea and Bacteria. Similarly, 
in the Siphoviridae family there is a clear phylogenetic relationship 
between some of the phages of dairy bacteria and 
the methanophage ΨM1. This relationship is apparent at the 
sequence level as well as in the genomic organization (15). 

Some of the similarities indicating common origins of 
distant virus families have not been recognized at the 
sequence level but are suggested by shared special features 
of virus life-styles. This has already been discussed above in 
comparing genome structure and replication of the archaeal 
rudiviruses and some large eukaryal DNA viruses including 
poxviruses, African swine fever virus, and Chlorella 
viruses. Certain features of modules of such viruses indicate 
their even earlier appearance. Although the virion structure 
of fusellovirus SSV1 differs from that of myoviruses, its 
genome organization and life cycle resemble those of 
bacteriophage 186 or similar temperate bacteriophages 
belonging to the myoviruses. Gene expression is organized 
in two modules in a manner resembling early and late 
regulation in these phages. One of these modules, encompassing 16 
“late” or viral protein synthesis and assembly genes, contains 
only one ORF with a cysteine codon while the other, 
encompassing genes involved in DNA replication and lysogeny, 
contains cysteine codons in 12 of 18 ORFs. Cysteine 
codons appeared supposedly late in the evolution of the 
genetic code (122), suggesting that the module practically 
devoid of them has an even earlier origin than the other, 
possibly well before the separation of the domains of life. 
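
The cysteine-codon comparison between the two SSV1 modules amounts to asking, for each ORF, whether any in-frame codon is TGT or TGC, and then comparing the fraction of such ORFs per module. The sketch below shows that counting step on short made-up sequences. The toy ORFs and function names are hypothetical and are not SSV1 data.

```python
CYS_CODONS = {"TGT", "TGC"}  # the two standard-code cysteine codons (DNA alphabet)


def has_cysteine_codon(orf):
    """True if the ORF, read in frame from its first base, contains a Cys codon."""
    seq = orf.upper()
    return any(seq[i:i + 3] in CYS_CODONS for i in range(0, len(seq) - 2, 3))


def fraction_with_cysteine(orfs):
    """Fraction of ORFs in a module that contain at least one cysteine codon."""
    if not orfs:
        return 0.0
    return sum(has_cysteine_codon(orf) for orf in orfs) / len(orfs)


if __name__ == "__main__":
    # Hypothetical toy ORFs, for illustration only.
    module_a = ["ATGGCTGAATAA", "ATGTTGCAATAA"]   # no in-frame TGT/TGC
    module_b = ["ATGGCTTGTTAA", "ATGTGCGAATAA"]   # each has an in-frame Cys codon
    print(fraction_with_cysteine(module_a))
    print(fraction_with_cysteine(module_b))
```

Reading strictly in frame matters here: the second toy ORF in `module_a` contains the letters TGC across a codon boundary, which does not encode cysteine and is therefore not counted.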

The major difficulty in establishing the very probable 
polyphyletic evolution of viruses, or their exchangeable 
modules respectively, appears to be the high rate of their 
evolution. Resolution of this difficulty will probably 
require the development of novel sophisticated methods of 
phylogenetic analysis. 


References 

1. Arnold, H. P. 1998. Isolierung und Beschreibung neuer 
Viren der crenarchaealen Gattung Sulfolobus. PhD thesis, 
Ludwig Maximilians University, Munich, Germany. 

2. Arnold, H. P., Q. She, H. Phan, K. Stedman, D. Prangishvili, 
I. Holz, J. K. Kristjansson, R. A. Garrett, and W. Zillig. 1999. 
The genetic element pSSVx of the extremely thermophilic 
crenarchaeon Sulfolobus is a hybrid between a plasmid and 
a virus. Mol. Microbiol. 34:217-226. 

3. Arnold, H. P., K. M. Stedman, and W. Zillig. 1999. Archaeal 
phages, pp. 76-89. In A. Granoff and R. G. Webster (eds.) 
Encyclopedia of Virology, 2nd ed, vol. 1. Academic Press, 
London. 

4. Arnold, H. P., U. Ziese, and W. Zillig. 2000. SNDV, a novel 
virus of the extremely thermophilic and acidophilic 
archaeon Sulfolobus. Virology 272:409-416. 

5. Arnold, H. P., W. Zillig, U. Ziese, I. Holz, M. Crosby, 
T. Utterback, J. F. Weidmann, J. K. Kristjanson, H. P. Klenk, 
K. E. Nelson, and C. M. Fraser. 2000. A novel lipothrixvirus, 
SIFV, of the extremely thermophilic crenarchaeon 
Sulfolobus. Virology 267:252-266. 

6. Balch, W. E., G. E. Fox, L. J. Magrum, C. R. Woese, and 
R. S. Wolfe. 1979. Methanogens: reevaluation of a unique 
biological group. Microbiol. Rev. 43:260-296. 

7. Baranyi, U., R. Klein, W. Lubitz, D. H. Kruger, and A. Witte. 
2000. The archaeal halophilic virus-encoded Dam-like 
methyltransferase M.ΦCh1-I methylates adenine residues 
and complements dam mutants in the low salt environment 
of Escherichia coli. Mol. Microbiol. 35:1168-1179. 

8. Barns, S. M., C. F. Delwiche, J. D. Palmer, and N. R. Pace. 
1996. Perspectives on archaeal diversity, thermophily and 
monophyly from environmental rRNA sequences. Proc. 
Natl. Acad. Sci. USA 93:9188-9193. 

9. Bath, C., and M. L. Dyall-Smith. 1998. His1, an archaeal 
virus of the Fuselloviridae family that infects Haloarcula 
hispanica. J. Virol. 72:9392-9395. 

10. Bertani, G. 1999. Transduction-like gene transfer in 
the methanogen Methanococcus voltae. J. Bacteriol. 
181:2992-3002. 


11. Bertani, G., and L. Baresi. 1986. Looking for gene transfer 
mechanisms in methanogenic bacteria, pp. 398. In 
O. Kandler and W. Zillig (eds.) Archaebacteria ’85, vol. 1. 
Gustav Fischer, Stuttgart. 

12. Bettstetter, M., X. Peng, R. A. Garrett, and D. Prangishvili. 
2003. AFV1, a novel virus infecting hyperthermophilic 
archaea of the genus Acidianus. Virology 
315:68-79. 

13. Birkenbihl, R. P., K. Neef, D. Prangishvili, and B. 
Kemper. 2001. Holliday junction resolving enzymes 
of archaeal viruses SIRV1 and SIRV2. J. Mol. Biol. 
309:1067-1076. 

14. Blum, H., W. Zillig, S. Mallok, H. Domdey, and 
D. Prangishvili. 2001. The genome of the archaeal virus 
SIRV1 has features in common with genomes of eukaryal 
viruses. Virology 281:6-9. 

15. Brussow, H., and F. Desiere. 2001. Comparative phage 
genomics and the evolution of Siphoviridae: insights from 
dairy phages. Mol. Microbiol. 39:213-222. 

16. Brussow, H., and R. W. Hendrix. 2002. Phage genomics: 
small is beautiful. Cell 108:13-16. 

17. Cannio, R., P. Contursi, M. Rossi, and S. Bartolucci. 1998. 
An autonomously replicating transforming vector for 
Sulfolobus solfataricus. J. Bacteriol. 180:3237-3240. 

18. Caspar, D. L. D., and A. Klug. 1962. Physical principles in 
the construction of regular viruses. Cold Spring Harb. 
Symp. Quant. Biol. 27:1-24. 

19. Daniels, L. L., and A. C. Wais. 1990. Ecophysiology of 
bacteriophage S5100 infecting Halobacterium cutirubrum. 
Appl. Environ. Microbiol. 56:3605-3608. 

20. Daniels, L. L., and A. C. Wais. 1984. Restriction and modification 
of halophage S45 in Halobacterium. Curr. Microbiol. 
10:133-136. 

21. Daniels, L. L., and A. C. Wais. 1998. Virulence in phage 
populations infecting Halobacterium cutirubrum. FEMS 
Microbiol. Ecol. 25:129-134. 

22. DeLong, E. F. 1998. Everything in moderation: 
archaea as “non-extremophiles”. Curr. Opin. Genet. Dev. 
8:649-654. 

23. Dyall-Smith, M., S. L. Tang, and C. Bath. 2003. Haloarchaeal 
viruses: How diverse are they? Res. Microbiol. 
154:309-313. 

24. Eiserling, F., A. Pushkin, M. Gingery, and G. Bertani. 1999. 
Bacteriophage-like particles associated with the gene 
transfer agent of Methanococcus voltae PS. J. Gen. Virol. 
80:3305-3308. 

25. Esposito, D., W. P. Fitzmaurice, R. C. Benjamin, S. D. 
Goodman, A. S. Waldman, and J. J. Scocca. 1996. The 
complete nucleotide sequence of bacteriophage HP1 DNA. 
Nucleic Acids Res. 24:2360-2368. 

26. Fleischmann, R. D., M. D. Adams, O. White, R. A. Clayton, 
E. F. Kirkness, A. R. Kerlavage, C. J. Bult, J. F. Tomb, B. A. 
Dougherty, J. M. Merrick, K. McKenney, G. Sutton, W. 
FitzHugh, C. Fields, J. D. Gocayne, J. Scott, R. Shirley, L.-I. 
Liu, A. Glodek, J. M. Kelley, J. F. Weidman, C. A. Phillips, 
T. Spriggs, E. Hedblom, M. D. Cotton, T. R. Utterback, M. C. 
Hanna, D. T. Nguyen, D. M. Saudek, R. C. Brandon, L. D. 
Fine, J. L. Fritchman, J. L. Fuhrmann, N. S. M. Geoghagen, 
C. L. Gnehm, L. A. McDonald, K. V. Small, C. M. Fraser, 
H. O. Smith, and J. C. Venter. 1995. Whole-genome random 
sequencing and assembly of Haemophilus influenzae Rd. 
Science 269:496-512. 

27. Forterre, P. 1999. Displacement of cellular proteins by 
functional analogues from plasmids or viruses could 
explain puzzling phylogenies of many DNA informational 
proteins. Mol. Microbiol. 33:457-465. 

28. Garrity, G. M., and J. G. Holt. 2001. Phylum AI. 
Crenarchaeota phy. nov., p. 169. In D. Boone, R. Castenholz, 
and G. Garrity (eds.) Bergey's Manual of Systematic 
Bacteriology, 2nd edn, vol. 1. Springer, New York. 

29. Garrity, G. M., and J. G. Holt. 2001. Phylum AII. Euryarchaeota 
phy. nov., p. 169. In D. Boone, R. Castenholz, and 
G. Garrity (eds.) Bergey’s Manual of Systematic Bacteriology, 
2nd edn, vol. 1. Springer, New York. 

30. Gelderblom, H. R. 1991. Assembly and morphology of 
HIV: potential effect of structure on viral function. AIDS 
5:617-637. 

30a. Geslin, C., M. Le Romancer, M. Gaillard, G. Erauso, and 
D. Prieur. 2003. Observation of virus-like particles in 
high temperature enrichment cultures from deep-sea 
hydrothermal vents. Res. Microbiol. 154:303-307. 

30b. Geslin, C., M. Le Romancer, G. Erauso, M. Gaillard, 
G. Perrot, and D. Prieur. 2003. PAV1, the first virus-like 
particle isolated from a hyperthermophilic euryarchaeote, 
“Pyrococcus abyssi.” J. Bacteriol. 185:3888-3894. 

31. Grant, W. D., M. Kamekura, T. J. McGenity, and A. Ventosa. 
2001. Class III. Halobacteria class nov., pp. 294-334. In 
D. Boone, R. Castenholz, and G. Garrity (eds.) Bergey's 
Manual of Systematic Bacteriology, 2nd edn, vol. 1. 
Springer, New York. 

32. Grogan, D. W. 1989. Phenotypic characterization of the 
archaebacterial genus Sulfolobus: comparison of five 
wild-type strains. J. Bacteriol. 171:6710-6719. 

33. Gropp, F., B. Grampp, P. Stolt, P. Palm, and W. Zillig. 1992. 
The immunity-conferring plasmid pΦHL from the 
Halobacterium salinarium phage ΦH: nucleotide sequence 
and transcription. Virology 190:45-54. 

34. Guixa-Boixereu, N., J. I. Calderon-Paz, M. Heldal, G. 
Bratbak, and C. Pedros-Alio. 1996. Viral lysis and bacterivory 
as prokaryotic loss factors along a salinity gradient. 
Aquat. Microb. Ecol. 11:215-227. 

35. Gurevich, P., and A. Oren. 1993. Characterization of the 
dominant halophilic archaea in a bacterial bloom in the 
Dead Sea. FEMS Microbiol. Ecol. 12:249-256. 

36. Hatfull, G. F., and G. J. Sarkis. 1993. DNA sequence, structure 
and gene expression of mycobacteriophage L5: a 
phage system for mycobacterial genetics. Mol. Microbiol. 
7:395-405. 

37. Hendrix, R. W. 2002. Bacteriophages: evolution of the 
majority. Theor. Popul. Biol. 61:471-480. 

38. Huber, H., M. J. Hohn, R. Rachel, T. Fuchs, V. C. Wimmer, 
and K. O. Stetter. 2002. A new phylum of Archaea represented 
by a nanosized hyperthermophilic symbiont. 
Nature 417:63-67. 

39. Huber, H., and K. O. Stetter. 2001. Family Sulfolobaceae, 
p. 198. In D. Boone, R. Castenholz, and G. Garrity (eds.) 
Bergey’s Manual of Systematic Bacteriology, 2nd edn, 
vol. 1. Springer, New York. 

40. Ishino, Y., and S. Ishino. 2001. DNA polymerases from 
euryarchaeota. Methods Enzymol. 334:249-260. 

41. Iwai, T., N. Kurosawa, Y. H. Itoh, and T. Horiuchi. 2000. 
Phylogenetic analysis of archaeal PCNA homologues. 
Extremophiles 4:357-364. 

42. Janekovic, D., S. Wunderl, I. Holz, W. Zillig, A. Gierl, and 
H. Neumann. 1983. TTV-1, TTV-2 and TTV-3: a family of 
viruses of the extremely thermophilic anaerobic sulfur 
reducing archaebacterium Thermoproteus tenax. Mol. Gen. 
Genet. 192:39-45. 

43. Jonuscheit, M., E. Martusewitsch, K. M. Stedman, and 
C. Schleper. 2003. A reporter gene system for the hyperthermophilic 
archaeon Sulfolobus solfataricus based on 
a selectable and integrative shuttle vector. Mol. Microbiol. 
48:1241-1252. 

44. Jordan, M., L. Meile, and T. Leisinger. 1989. Organization 
of Methanobacterium thermoautotrophicum bacteriophage 
ΨM1 DNA. Mol. Gen. Genet. 220:161-164. 

45. Karner, M. B., E. F. DeLong, and D. M. Karl. 2001. Archaeal 
dominance in the mesopelagic zone of the Pacific Ocean. 
Nature 409:507-510. 

46. Keeling, P. J., H. P. Klenk, R. K. Singh, O. Feeley, C. Schleper, 
W. Zillig, W. F. Doolittle, and C. W. Sensen. 1996. Complete 
nucleotide sequence of the Sulfolobus islandicus multicopy 
plasmid pRN1. Plasmid 35:141-144. 

47. Keeling, P. J., H. P. Klenk, R. K. Singh, M. E. Schenk, 
C. W. Sensen, W. Zillig, and W. F. Doolittle. 1998. 
Sulfolobus islandicus plasmids pRN1 and pRN2 share 
distant but common evolutionary ancestry. Extremophiles 
2:391-393. 

48. Klein, R., U. Baranyi, N. Rossler, B. Greineder, H. Scholz, 
and A. Witte. 2002. Natrialba magadii virus ΦCh1: first 
complete nucleotide sequence and functional organization 
of a virus infecting a haloalkaliphilic archaeon. 
Mol. Microbiol. 45:851-863. 

49. Klein, R., B. Greineder, U. Baranyi, and A. Witte. 2000. 
The structural protein E of the archaeal virus phiCh1: 
evidence for processing in Natrialba magadii during virus 
maturation. Virology 276:376-387. 

50. Knox, M. R., and J. E. Harris. 1986. Isolation and characterization 
of a bacteriophage of Methanobrevibacter 
smithii. In Abstracts of the XIV International Congress 
on Microbiology. IUMS, Manchester. 

51. Lawrence, J. G., G. F. Hatfull, and R. W. Hendrix. 2002. 
Imbroglios of viral taxonomy: genetic exchange and 
failings of phenetic approaches. J. Bacteriol. 
184:4891-4905. 

52. Lucas, S., L. Toffin, Y. Zivanovic, D. Charlier, H. Moussard, 
P. Forterre, D. Prieur, and G. Erauso. 2002. Construction 
of a shuttle vector for, and spheroplast transformation of, 
the hyperthermophilic archaeon Pyrococcus abyssi. Appl. 
Environ. Microbiol. 68:5528-5536. 

53. Luo, Y., P. Pfister, T. Leisinger, and A. Wasserfallen. 2001. 
The genome of archaeal prophage ΨM100 encodes the 
lytic enzyme responsible for autolysis of Methanothermobacter 
wolfeii. J. Bacteriol. 183:5788-5792. 

54. Luo, Y., P. Pfister, T. Leisinger, and A. Wasserfallen. 2002. 
Pseudomurein endoisopeptidases PeiW and PeiP, two 
moderately related members of a novel family of proteases 
produced in Methanothermobacter strains. FEMS Microbiol. 
Lett. 208:47-51. 

55. Makino, S., N. Amano, H. Koike, and M. Suzuki. 1999. 
Prophages inserted in archaebacterial genomes. Proc. Jpn. 
Acad., Ser. B 75:166-171. 

56. Marteinsson, V. T., J. K. Kristjansson, H. Kristmannsdottir, 
M. Dahlkvist, K. Saemundsson, M. Hannington, S. K. 
Petursdottir, A. Geptner, and P. Stoffers. 2001. Discovery 
and description of giant submarine smectite cones on 
the seafloor in Eyjafjordur, northern Iceland, and a novel 
thermal microbial habitat. Appl. Environ. Microbiol. 
67:827-833. 

57. Martin, A., S. Yeats, D. Janekovic, W. D. Reiter, W. Aicher, 
and W. Zillig. 1984. SAV-1, a temperate UV inducible DNA 
virus-like particle from the archaebacterium Sulfolobus 
acidocaldarius isolate B-12. EMBO J. 3:2165-2168. 

58. Meile, L., P. Abendschein, and T. Leisinger. 1990. Transduction 
in the archaebacterium Methanobacterium thermoautotrophicum 
Marburg. J. Bacteriol. 172:3507-3508. 

59. Meile, L., U. Jenal, D. Studer, M. Jordan, and T. Leisinger. 
1989. Characterization of ΨM1, a virulent phage of 
Methanobacterium thermoautotrophicum Marburg. Arch. 
Microbiol. 152:105-110. 

60. Muskhelishvili, G. 1994. The archaeal SSV integrase 
promotes intermolecular excisive recombination in vitro. 
Appl. Microbiol. 16:605-608. 

61. Muskhelishvili, G., P. Palm, and W. Zillig. 1993. SSV1-encoded 
site-specific recombination system in Sulfolobus 
shibatae. Mol. Gen. Genet. 237:334-342. 

62. Nadal, M., G. Mirambeau, P. Forterre, W. D. Reiter, and 
M. Duguet. 1986. Positively supercoiled DNA in a 
virus-like particle of an archaebacterium. Nature 
321:256-258. 

63. Neumann, H., and W. Zillig. 1990. Structural variability 
in the genome of the Thermoproteus tenax virus TTV1. 
Mol. Gen. Genet. 222:435-437. 

64. Ng, W. V., S. P. Kennedy, G. G. Mahairas, B. Berquist, 
M. Pan, H. D. Shulda, S. R. Lasky, N. S. Baliga, V. Thorsson, 
J. Sbrogna, S. Swartzell, D. Weir, J. Hall, T. A. Dahl, R. Welti, 
Y. A. Goo, B. Leithauser, K. Keller, R. Cruz, M. J. Danson, 
D. W. Hough, D. G. Maddocks, P. E. Jablonski, M. P. Krebs, 
C. M. Angevine, H. Dale, T. A. Isenbarger, R. F. Peck, 
M. Pohlschroder, J. L. Spudich, K. W. Jung, M. Alam, 
T. Freitas, S. Hou, C. J. Daniels, P. P. Dennis, A. D. Omer, 
H. Ebhardt, T. M. Lowe, P. Liang, M. Riley, L. Hood, and 
S. DasSarma. 2000. Genome sequence of Halobacterium 
species NRC-1. Proc. Natl. Acad. Sci. USA 97:12176-12181. 

65. Nolling, J., and W. M. de Vos. 1992. Identification of the 
CTAG-recognizing restriction-modification systems 
MthZI and MthFI from Methanobacterium thermoformici- 
cum and characterization of the plasmid-encoded mthZIM 
gene. Nucleic Acids Res. 20:5047-5052. 

66. Nolling, J., A. Groffen, andW. M. DeVos. 1993. ®F1 and OF3, 
two novel virulent, archaeal phages infecting different 
thermophilic strains of the genus Methanobacterium. 
J. Gen. Microbiol. 139:2511-2516. 

67. Nuttall, S. D., and M. L. Dyall-Smith. 1995. Halophage 
HF2: genome organization and replication strategy. 
J. Virol. 69:2322-2327. 


68. Nuttall, S. D., and M. L. Dyall-Smith. 1993. HF1 and HF2: 
novel bacteriophages of halophilic archaea. Virology 
197:678-684. 

69. Oren, A. 1994. The ecology of the extremely halophilic 
archaea. FEMS Microbiol. Rev. 13:415-440. 

70. Oren, A., G. Bratbak, and M. Heldal. 1997. Occurrence 
of virus-like particles in the Dead Sea. Extremophiles 
1:143-149. 

71. Palm, P., C. Schleper, B. Grampp, S. Yeats, P. McWilliam, 
W. D. Reiter, and W. Zillig. 1991. Complete nucleotide 
sequence of the virus SSV1 of the archaebacterium 
Sulfolobus shibatae.Virology 185:242-250. 

72. Pauling, C. 1982. Bacteriophages of Halobacterium 
halobium: isolated from fermented fish sauce and primary 
characterization. Can. J. Microbiol. 28:916-921. 

73. Peng, X., H. Blum, 0. She, S. Mallok, K. Brugger, 
R. A. Garrett, W. Zillig, and D. Prangishvili. 2001. Sequen¬ 
ces and replication of genomes of the archaeal rudi- 
viruses SIRV1 and SIRV2: relationships to the archaeal 
lipothrixvirus SIFV and some eukaryal viruses. Virology 
291:226-234. 

74. Peng, X., I. Holz.W. Zillig, R. A. Garrett, and Q. She. 2000. 
Evolution of the family of pRN plasmids and their integrase- 
mediated insertion into the chromosome of the crenarch- 
aeon Sulfolobus solfataricus. J. Mol. Biol. 303:449-454. 

75. Pfeifer, F. 1987. Genetics of Halobacteria, pp. 105-133. In 
F. Rodriguez-Valera (ed.) Halophilic Bacteria. CRC Press, 
Boca Raton, Fla. 

76. Pfister, P., A. Wasserfallen, R. Stettler, and T. Leisinger. 
1998. Molecular analysis of Methanobacterium phage 
4dM2. Mol. Microbiol. 30:233-244. 

77. Prangishvili, D., H. P. Arnold, D. Gotz, U. Ziese, I. Holz, 
J. K. Kristjansson, and W. Zillig. 1999. A novel virus family, 
the Rudiviridae: structure, virus-host interactions and 
genome variability of the sulfolobus viruses SIRV1 and 
SIRV2. Genetics 152:1387-1396. 

78. Prangishvili, D., H. P. Klenk, G. Jakobs, A. Schmiechen, 
C. Hanselmann, I. Holz, and W. Zillig. 1998. Biochemical 
and phylogenetic characterization of the dUTPase from 
the archaeal virus SIRV J. Biol. Chem. 273:6024-6029. 

79. Prangishvili, D., K. Stedman, and W. Zillig. 2001. Viruses 
of the extremely thermophilic archaeon Sulfolobus. 
Trends Microbiol. 9:39-43. 

80. Oureshi, S. A., and S. P. Jackson. 1998. Sequence-specific 
DNA binding by the S. shibatae TFIIB homolog, TFB, and 
its effect on promoter strength. Mol. Cell 1:389-400. 

81. Rachel, R., M. Bettstetter, B. P. Hedlund, M. Haring, 
A. Kessler, K. 0. Stetter, and D. Prangishvili. 2002. 
Remarkable morphological diversity of viruses and virus¬ 
like particles in hot terrestrial environments. Arch. Virol. 
147: 2419-2429. 

82. Reiter,W. D., U. Hudepohl, andW. Zillig. 1990. Mutational 
analysis of an archaebacterial promoter: essential role 
of a TATA box for transcription efficiency and start-site 
selection in vitro. Proc. Natl. Acad. Sci. USA 87:9509-9513. 

83. Reiter, W. D., and P. Palm. 1990. Identification and charac¬ 
terization of a defective SSV1 genome integrated into a 
tRNA gene in the archaebacterium Sulfolobus sp. B12. 
Mol. Gen. Genet. 221:65-71. 



VIRUSES OF ARCHAEA 515 


84. Reiter, W. D., P. Palm, A. Henschen, F. Lottspeich, W. Zillig, 
and B. Grampp. 1987. Identification and characterization 
of the genes encoding three structural proteins of 
the Sulfolobus virus-like particle SSV1. Mol. Gen. Genet. 
206:144-153. 

85. Reiter, W. D., P. Palm, and S. Yeats. 1989. Transfer RNA 
genes frequently serve as integration sites for prokaryotic 
genetic elements. Nucleic Acids Res. 17:1907-1914. 

86. Reiter, W. D., P. Palm, S. Yeats, and W. Zillig. 1987. Gene 
expression in archaebacteria physical mapping of consti¬ 
tutive and UV-inducible transcripts from the Sulfolobus 
virus-like particle SSV1. Mol. Gen. Genet. 209:270-275. 

87. Reysenbach, A. L., M. Ehringer, and K. Hershberger. 

2000. Microbial diversity at 83 °C in Calcite Springs, 
Yellowstone National Park: another environment where 
the Aquificales and “Korarchaeota" coexist. Extremophiles 
4:61-67. 

88a. Rice, G., K. M. Stedman, J. Snyder, B. Wiedenheft, 

D. Willits, S. Brumfield, T. McDermott, and M. J. Young. 

2001. Novel viruses from extreme thermal environments. 
Proc. Natl. Acad. Sci. USA 98:13341-13345. 

88b. Rice, G., L. Tang, K. Stedman, F. Roberto, J. Spuhler, 

E. Gillitzer, J. E. Johnson, T. Douglas, and M. Young. 2004. 
The structure of a thermophilic archaeal virus shows a 
double-stranded DNA viral capsid type that spans all 
domains of life. Proc. Natl. Acad. Sci. USA 10.1:7716-7720. 

89. Schleper, C., K. Kubo, and W. Zillig. 1992. The particle 
SSV1 from the extremely thermophilic archaeon 
Sulfolobus is a virus: demonstration of infectivity and 
of transfection with viral DNA. Proc. Natl. Acad. Sci. 
USA 89:7645-7649. 

90. Schnabel, H. 1984. Integration of plasmid p-OHl into phage 
genomes during infection of Halobacterium halobium 
R-l-L with phage OHI. Mol. Gen. Genet. 197:19-23. 

91. Schnabel, H., and W. Zillig. 1984. Circular structure of 
the genome of phage OH in a lysogenic Halobacterium 
halobium. Mol. Gen. Genet. 193:422-426. 

92. Schnabel, H., and W. Zillig. 1982. Structural variations in 
the DNA of Halobacterium halobium phage OH. Zentralbl. 
Bakteriol. Mikrobiol. Hyg., 1, Abt. Orig., A. 253:35-36. 

93. Schnabel, H.,W. Zillig, M. Pfaeffle, R. Schnabel, H. Michel, 
and H. Delius. 1982. Halobacterium halobium phage OH. 
EMB0J. 1:87-92. 

94. Segerer, A., K. 0. Stetter, and E. Klink. 1985. Two contrary 
modes of chemolithotrophy in the same archaebacterium. 
Nature 313:787-789. 

95. She, Q., K. Brugger, and L. Chen. 2002. Archaeal integra¬ 
tive genetic elements and their impact on genome evolu¬ 
tion. Res. Microbiol. 153:325-332. 

96. She, 0., X. Peng, W. Zillig, and R. A. Garrett. 2001. Gene 
capture in archaeal chromosomes. Nature 409:478. 

97. She, Q., R. K. Singh, F. Confalonieri.Y. Zivanovic, G. Allard, 

M. J. Awayez, C. C.-Y. Chan-Weihere, I. G. Clausen, 

B. A. Curtis, A. De Moors, G. Erauso, C. Fletcher, P. M. K. 
Gordon, I. Heikamp-de Jong, A. C. Jeffries, C. J. Kozera, 

N. Medina, X. Peng, H. P. Thi-Ngoc, P. Redder, M. E. Schenk, 

C. Theriault, N. Tolstrup, R. L. Charlebois, W. F. Doolittle, 
M. Duguet, T. Gaasterland, R. A. Garrett, M. A. Ragan, 
C. W. Sensen, and J. Van der Oost. 2001. The complete 


genome of the crenarchaeon Sulfolobus solfataricus P2. 
Proc. Natl. Acad. Sci. USA 98:7835-7840. 

98. She, Q. X., H. E. Phan, R. A. Garrett, S. V Albers, 
K. M. Stedman, and W. Zillig. 1998. Genetic profile of 
pNOB8 from Sulfolobus: the first conjugative plasmid 
from an archaeon. Extremophiles 2:417-425. 

99. Siebers, B., V F. Wendisch, and R. Hensel. 1997. Carbo¬ 
hydrate metabolism in Thermoproteus tenax: in vivo utili¬ 
zation of the non-phosphorylative Entner-Doudoroff 
pathway and characterization of its first enzyme, glucose 
dehydrogenase. Arch. Microbiol. 168:120-127. 

100. Smith, D. R., L. A. Doucette-Stamm, C. Deloughery, H. 
Lee, J. Dubois, T. Aldredge, R. Bashirzadeh, D. Blakely, R. 
Cook, K. Gilbert, D. Harrison, L. Hoang, P. Keagle, W. 
Lumm, B. Pothier, D. Oiu, R. Spadafora, R. Vicaire, Y. 
Wang, J. Wierzbowski, R. Gibson, N. Jiwani, A. Caruso, D. 
Bush, J. N. Reeve, et al. 1997. Complete genome sequence 
of Methanobacterium thermoautotrophicum AH: func¬ 
tional analysis and comparative genomics. J. Bacteriol. 
179:7135-7155. 

101. Stax, D., R. Hermann, R. Falchetto, and T. Leisinger. 
1992. The lytic enzyme in bacteriophage *FM1 induced 
lysates of Methanobacterium thermoautotrophicum 
Marburg. FEMS Microbial Lett. 100:433-438. 

102. Stedman, K., C. Schleper, E. Rumpf, and W. Zillig. 1999. 
Genetic requirements for the function of the archaeal 
virus SSV1 in Sulfolobus solfataricus: construction and 
testing of viral shuttle vectors. Genetics 152:1397-1405. 

103. Stedman, K. M., Q. She, H. Phan, H. P. Arnold, I. Holz, 
R. A. Garrett, and W. Zillig. 2003. Relationships between 
fuselloviruses infecting the extremely thermophilic 
archaeon Sulfolobus: SSV1 and SSV2. Res. Microbiol. 
154:295-302. 

104. Stettler, R., C. Thurner, D. Stax, L. Meile, andT. Leisinger. 
1995. Evidence for a defective prophage on the chromo¬ 
some of Methanobacterium wolfei. FEMS Microbiol. Lett. 
132:85-89. 

105. Stolt, P., B. Grampp, and W. Zillig. 1994. Genes for DNA 
cytosine methyltransferases and structural proteins, 
expressed during lytic growth by the phage OH of the 
archaebacterium Halobacterium salinarium. Biol. Chem. 
Hoppe Seyler 375:747-757. 

106. Stolt, E, and W. Zillig. 1993. Antisense RNA mediates 
transcriptional processing in an archaebacterium, indi¬ 
cating a novel kind of RNase activity. Mol. Microbiol. 
7:875-882. 

107. Stolt, E, and W. Zillig. 1995. Archaebacterial bacterio¬ 
phages. In R. Webster and A. Granoff (eds.) Encyclopedia 
of Virology Flus. Academic Fress, London. 

108. Stolt, P., and W. Zillig. 1994. Gene regulation in halophage 
OH; more than promoters. Appl. Microbiol. 16:591-596. 

109. Stolt, E, and W. Zillig. 1994. Transcription of the haloph¬ 
age OH repressor gene is abolished by transcription 
from an inversely oriented lytic promoter. FEBS Lett. 
344:125-128. 

110. Tang, S. L., S. Nuttall, K. Ngui, C. Fisher, P. Lopez, and M. 
Dyall-Smith. 2002. HF2: a double-stranded DNA tailed 
haloarchaeal virus with a mosaic genome. Mol. Microbiol. 
44:283-296. 



516 PART V: PHAGES BY HOST OR HABITAT 


111. Torsvik, T., and I. D. Dundas. 1974. Bacteriophage of 
Halobacterium salinarum. Nature 248:680-681. 

112. Torsvik, T., and I. D. Dundas. 1980. Persisting phage 
infection in Halobacterium salinarum str.l. ]. Gen. Virol. 
47:29-36. 

113. Touzel, J. P., E. C. De Macario, ]. Nolling, W. M. De Vos, 
T. Zhilina, and A. M. Lysenko. 1992. DNA relatedness 
among some thermophilic members of the genus 
Methanobacterium: emendation of the species Methano- 
bacterium thermoautotrophicum and rejection of 
Methanobacterium thermoformicicum as a synonym of 
Methanobacterium thermoautotrophicum. Int. J. Syst. 
Bacteriol. 42:408-411. 

114. Tumbula, D. L., T. L. Bowen, and W. B. Whitman. 
1995. Growth of methanogens on solidified medium, 
pp. 49-55. In K. R. Sowers and H. J. Schreier (eds.) 
Archaea: Methanogens. A Laboratory Manual. Cold 
Spring Harbor Laboratory Press, Plainview. 

115. Tumbula, D. L., and W. B. Whitman. 1999. Genetics of 
Methanococcus: possibilities for functional genomics in 
Archaea. Mol. Microbiol. 33:1-7. 

116. Vogelsang-Wenke, H., and D. Oesterhelt. 1986. 

Halophage ON, pp. 403-405. In 0. Kandler and 
W. Zillig (eds.) Archaebacteria'85, vol. 1. Gustav Fischer, 
Stuttgart. 

117. Wais, A. C., M. Kon, R. E. MacDonald, and B. D. 
Stollar. 1975. Salt-dependent bacteriophage infecting 
Halobacterium cutirubrum and H. halobium. Nature 
256:314-315. 

118. Wasserfallen, A., J. Nolling, P. Pfister, J. Reeve, and 
E. Conway de Macario. 2000. Phylogenetic analysis of 
18 thermophilic Methanobacterium isolates supports the 
proposals to create a new genus, Methanothermobacter 
gen. nov., and to reclassify several isolates in three 
species, Methanothermobacter thermautotrophicus comb, 
nov., Methanothermobacter wolfeii comb, nov., and Metha¬ 
nothermobacter marburgensis sp. nov. Int. J. Syst. Evol. 
Microbiol. 50:43-53. 

119. Wiedenheft, B., K. Stedman, F. Roberto, D. Willits, 
A. K. Gleske, L. Zoeller, J. Snyder, T. Douglas, and 
M. Young. 2004. Comparative genomic analysis of 
hyperthermophilic archaeal Fuselloviridae viruses. J. Virol. 
78:1954-1961. 

120. Witte, A., U. Baranyi, R. Klein, M. Sulzner, C. Luo, G. 
Wanner, D. H. Kruger, and W. Lubitz. 1997. Characteri¬ 
zation of Natronobacterium magadii phage ®Chl, a 
unique archaeal phage containing DNA and RNA. Mol. 
Microbiol. 23:603-616. 

121. Woese, C. R., and G. E. Fox. 1977. Phylogenetic structure of 
the prokaryotic domain: the primary kingdoms. Proc. 
Natl. Acad. Sci. USA 74:5088-5090. 

122. Wong, J. T. 1975. A co-evolution theory of the genetic code. 
Proc. Natl. Acad. Sci. USA 72:1909-1912. 


123. Wood, A. G.,W. B.Whitman, andj. Konisky. 1989. Isolation 
and characterization of an archaebacterial viruslike 
particle from Methanococcus voltae A3. J. Bacteriol. 
171:93-98. 

124. Yeats, S., P. McWilliam, and W. Zillig. 1982. A plasmid in 
the archaebacterium Sulfolobus solfataricus. EMBO J. 
1:1035-1038. 

125. Zillig, W. 1991. Comparative biochemistry of Archaea and 
Bacteria. Curr. Opin. Genet. Dev. 1:544-551. 

126. Zillig, W., H. P. Arnold, I. Holz, D. Prangishvili, 
A. Schweier, K. Stedman, 0. She, H. Phan. R. Garrett, 
and ]. K. Kristjansson. 1998. Genetic elements in the 
extremely thermophilic archaeon Sulfolobus. Extremo- 
philes 2:131-140. 

127. Zillig, W., F. Gropp, A. Henschen, H. Neumann, P. Palm, 
W. D. Reiter, M. Rettenberger, H. Schnabel, and S. Yeats. 
1985. Archaebacterial virus host systems. Appl. 
Microbiol. 7:58-66. 

128. Zillig, W., I. Holz, H. P. Klenk, J. Trent, S. Wunderl, 
D. Janekovic, E. Imsel, and B. Haas. 1987. Pyrococcus 
woesei new-species an ultra-thermophilic marine archae¬ 
bacterium representing a novel order Thermococcales. 
Appl. Microbiol. 9:62-70. 

129. Zillig, W., A. Kletzin, C. Schleper, I. Holz, D. Janekovic, 
J. Hain, M. Lanzendoerfer, and J. K. Kristjansson. 1994. 
Screening for Sulfolobales, their plasmids and their 
viruses in Icelandic solfataras. Appl. Microbiol. 16: 
609-628. 

130. Zillig, W, D. Prangishvilli, C. Schleper, M. Elferink, I. Holz, 
S. Albers, D. Janekovic, and D. Gotz. 1996. Viruses, 
plasmids and other genetic elements of thermophilic 
and hyperthermophilic Archaea. FEMS Microbiol. Rev. 
18:225-236. 

131. Zillig, W.,W. D. Reiter, P. Palm, F. Gropp, H. Neumann, 
and M. Rettenberger. 1988. Viruses of Archaebacteria, 
pp. 517-558. In R. Calendar (ed.) The Bacteriophages, 
vol. 1. Plenum Press, New York. 

132. Zillig, W., and A. L. Reysenbach. 2001. Thermococcales, 
pp. 341. In D. Boone, R. Castenholz, and G. Garrity (eds.) 
Bergey's Manual of Systematic Bacteriology, 2nd edn, 
vol. 1. Springer, New York. 

133. Zillig, W., J. Tu, and I. Holz. 1981. Thermoproteales: a 
third order of thermoacidophilic archaebacteria. Nature 
293:85-86. 

134. Zillig, W., S. Yeats, I. Holz, A. Boeck, M. Rettenberger, 
F. Gropp, and G. Simon. 1986. Desulfurolobus ambivalens 
new-genus new-species an autotrophic archaebacterium 
facultatively oxidizing or reducing sulfur. Appl. Micro¬ 
biol. 8:197-203. 

135. Zivanovic, Y., P. Lopez, H. Philippe, and P. Forterre. 
2002. Pyrococcus genome comparison evidences chromo¬ 
some shuffling-driven evolution. Nucleic Acids Res. 
30:1902-1910. 



32 


Phages of Cyanobacteria 

NICHOLAS H. MANN 


The scientific importance of the phages of cyanobacteria
(cyanophages) is intimately associated with the
ecological significance of their hosts. Cyanobacteria are
arguably the most diverse and widely distributed group
of eubacteria on the planet and play central roles in
major biogeochemical processes, such as the carbon and
nitrogen cycles. Cyanobacteria exist in a wide range of
freshwater and marine environments, ranging from
thermophilic to psychrophilic, and terrestrial environments,
including those subject to periodic desiccation. By virtue of
their higher plant-like oxygenic photosynthetic apparatus
they contribute significantly to the maintenance of the
Earth's atmosphere, in terms of both oxygen production
and carbon dioxide fixation. Consequently, the ability of
cyanophages to determine the population structures and
genetic diversity of cyanobacteria, as well as to potentially
influence the dynamics of biogeochemical processes, gives
them a unique ecological significance.

The study of cyanobacteria has a long history. Indeed,
their taxonomy was discussed by Linnaeus in 1753 (61; as
cited by 124). Thus, it seems strange that the first
cyanophage, which caused lysis of several species of
cyanobacteria of the genera Lyngbya, Plectonema, and
Phormidium, was not discovered until 1963, by Safferman
and Morris (90), during the characterization of sewage
settling ponds in the neighborhood of Cincinnati for the
Environmental Protection Agency. Why were phages infecting
cyanobacteria not found prior to 1963? One possible reason
is that before 1963 cyanobacteria were called blue-green
algae and considered to be eukaryotic. Subsequently the
cyanobacteria were shown to have lysozyme-sensitive
cell walls, to be sensitive to penicillin, to have
diaminopimelic acid in their peptidoglycan, and to have
typical prokaryotic ribosomes. These features established
their prokaryotic nature. The discovery of phages that could
be characterized by typical one-step growth curves and
counted by the formation of plaques on confluent lawns of
the host helped to confirm this assignment.

Following the recognition that viruses were abundant
in sea water (12) and that marine cyanobacteria were
visibly phage-infected (82), the first marine cyanophages
were isolated in 1991 (72), while the first phages
infecting the abundant marine Synechococcus strains
were isolated in 1993 (112, 121, 128). There is an enormous
diversity of cyanobacteria in the marine environment,
which is well reviewed by Golubic et al. (43). Work on
marine cyanophages, however, has been focused largely,
though not exclusively, on those which infect unicellular
cyanobacteria of the genera Synechococcus and
Prochlorococcus. These organisms dominate the prokaryotic
component of the picophytoplankton and make a very significant
contribution to overall primary production in the oceans
(40, 60, 62, 118). Cyanophages in general and marine
cyanophages in particular have been the topics of
recent reviews (66, 110). In this chapter there is no attempt
to comprehensively review the cyanophage literature, but
rather attention is focused on the present state of
knowledge regarding the nature of cyanophages, their
interactions with their hosts, and their ecological impact.

The Cyanobacterial Hosts 

For a variety of reasons an account of cyanophages
must begin with a consideration of the hosts they infect.
The cyanobacteria occupy an extraordinarily diverse
range of environments and often make major contributions
to biogeochemical processes. Their oxygenic
photoautotrophic mode of nutrition is distinct in the
prokaryotic kingdom, being shared only with the
prochlorophytes, and has significant implications for phage
replication. Their cell envelopes have certain structural
features which distinguish them from other eubacteria and
which influence phage adsorption. Finally, many areas of the
taxonomy of cyanobacteria are highly problematic and this
has led to confusion in some host range studies.

The most important physiological feature of cyanobacteria
is oxygenic photosynthesis. Like higher plants,
cyanobacteria possess two photosystems, PSI and PSII,
that are connected by an inter-system electron transport
pathway. Of the cyanobacteria that have been tested, the
very large majority are capable of taking up and utilizing
exogenous carbon compounds, but very few are capable of
dark growth at the expense of exogenous carbon compounds.
Thus, cyanobacteria, for the most part, are obligate
phototrophs.

A second characteristic feature of many cyanobacteria
is the possession of macromolecular light-harvesting
antennae, known as phycobilisomes, that are composed of
chromophore-bearing phycobiliproteins. It is these
phycobiliproteins, together with chlorophyll a, that give
cyanobacteria their characteristic coloration: blue-green
when phycocyanin is the major phycobiliprotein and
orange-red when phycoerythrin predominates. The
phycobiliproteins are extremely abundant and may represent
as much as half the total cell protein. A major exception is
the lack of phycobilisomes in marine Prochlorococcus strains
and their possession of a chlorophyll a2/b2 light-harvesting
complex (23).

Obviously, in the natural environment cyanobacteria 
are subject to a light-dark cycle. During the dark periods 
they generate maintenance energy by the oxidation of 
carbohydrate reserves, usually glycogen, via the pentose 
phosphate pathway. Many species of cyanobacteria,
including the ecologically important filamentous marine
cyanobacterium Trichodesmium, are also capable of fixing
atmospheric dinitrogen.

Cyanobacteria exhibit a range of patterns of cellular
organization, from simple unicells that divide by binary
fission through to multiseriate trichomes with branches
and differentiated cells. These morphological features
have been used extensively to produce a cyanobacterial
taxonomy. However, the classification of cyanobacteria is
extremely problematic and the factors leading to this are
well reviewed by Wilmotte (125). For many years
cyanobacteria, or blue-green algae as they were then known,
were treated as just one group of the algae, with
morphological characters being the key to their taxonomy and
their nomenclature being ruled by the Botanical Code.
This classical botanical taxonomy was challenged by the
recognition of the distinction between prokaryotes and
eukaryotes and the compelling evidence that blue-green
algae were in fact bacteria and should be subject to the
Bacteriological Code (108). Accordingly, a new
bacteriological taxonomy of the cyanobacteria was published
by Rippka et al. in 1979 (85).

Currently, cyanobacteria that have been studied in
pure culture are placed in five orders (119), but
molecular approaches show that three of these are not
necessarily monophyletic (e.g., 50) and only two form single
coherent lineages. The situation is further complicated
by the fact that many ecologists still adhere to the
Botanical Code when naming cyanobacteria in field samples
(124). One highly relevant result of this taxonomic
confusion is that in certain cases it has led to the
erroneous interpretation of phage host range studies (110).

The marine unicellular cyanobacteria are a particularly
important group as far as phage biology is concerned.
Also, their comparatively recent discovery and
simple morphology mean that there are no serious
taxonomic problems associated with them. Marine unicellular
cyanobacteria assigned to the genus Synechococcus
and possessing phycoerythrin as their primary accessory
light-harvesting pigment were classified as marine
cluster A (MC-A) and distinguished from marine cluster B
(MC-B), members of which have phycocyanin as their major
light-harvesting pigment (120). MC-A Synechococcus and
Prochlorococcus strains are very closely related, despite
considerable differences in their light-harvesting apparatus,
and represent a monophyletic clade (81, 116).

A consideration of the cyanobacterial cell envelope
is important since this is the site at which the phage
attaches to the cell and also the potential barrier to the
injection of phage DNA. The structure of the
cyanobacterial cell envelope has been reviewed by Gantt (36).
The peptidoglycan layer surrounding the cytoplasmic
membrane, although structurally similar to that of
Gram-negative bacteria, is thicker and has a chemical
composition which is closer to that of Gram-positive
bacteria. The cyanobacterial outer membrane in many species
is surrounded by a carbohydrate-enriched fibrous glycocalyx
sheath. Some bacteria possess an S-layer outside the outer
membrane. S-layers are composed of two-dimensional,
monomolecular quasi-crystalline arrays of identical units
of protein or glycoprotein and have been found in 60
unicellular strains and five filamentous cyanobacteria
strains (104). Recently the presence of an S-layer was
reported for a member of the abundant marine MC-A
Synechococcus group (93).


Characterization and Classification of Cyanophages

The discovery and characterization of cyanophages has
proceeded in two very distinct stages. Firstly, interest
was focused on cyanophages from freshwater systems
and there was then a period of comparatively little activity
until the early 1990s when attention was shifted to
marine systems. All the cyanophages so far characterized
fall into the three families of tailed phages with
double-stranded DNA genomes recognized by the International
Committee on the Taxonomy of Viruses (ICTV): the
myoviruses, siphoviruses, and podoviruses (see chapter 2 for
a review of phage classification by virion morphology).
Safferman et al. (89) similarly proposed three genera
for the cyanophages, but these taxonomic suggestions
have never been adopted by the ICTV. However, the
non-taxonomic terms derived from the three proposed
genera (cyanomyovirus, cyanopodovirus, and
cyanosiphovirus, formerly cyanostylovirus) are frequently
employed as useful shorthand for discussing cyanophages.

There is a growing feeling that there are inherent
problems in the phenetic approach to phage classification
(58) and, indeed, a taxonomy based on sequenced genomes
may thus be more appropriate (88). Certainly there is
strong evidence that genetic mosaicism and access to a
common phage gene pool is an important feature of
cyanophage evolution (45) and, as a consequence, a phenetic
classification of cyanophages is probably of very limited
phylogenetic value. Consequently, while the cyanophages
will be discussed here in terms of the ICTV-approved
families, this should not be interpreted as implying any
phylogenetic relatedness between the cyanophages
classified in this way unless there is additional,
nonmorphological evidence to indicate shared homologous
features. Similarly, there is no attempt to assign
cyanophages to the generic subdivisions of these families.
What is intended instead is a discussion, here, of the
diversity of the cyanophages assigned to these families.
More comprehensive listings and descriptions of cyanophages
can be found elsewhere (69, 110).

The myoviruses, of which the archetype is the
coliphage T4 (reviewed in chapter 18), have long contractile
tails and a head that exhibits icosahedral symmetry, or
symmetry which is derived from a basic icosahedral
structure. There is considerable morphological and molecular
diversity amongst the cyanophages assigned to this family
and the morphological diversity of marine
cyanomyoviruses has been particularly remarked on (112, 121).
One of the marine cyanomyoviruses has been reported to have
neck filaments (112), which are rare amongst phages in
general and had previously been reported for cyanophages
infecting filamentous freshwater cyanobacteria (1, 78).
Members of the family have been isolated which infect
filamentous or unicellular hosts. The mol%GC content of
cyanomyoviruses ranges from 37% for phage N1 (2) to
55% for phage AS-1M (98); there is also enormous
variation in genome sizes, with N1 at 37 kb (2) and the marine
cyanophage S-PM2 at 196 kb (45). The large majority
of phage isolates infecting MC-A Synechococcus strains
(63, 112, 121, 128) are myoviruses, as are phages
infecting Prochlorococcus marinus strains belonging to the
low-light clade (109).

The siphoviruses are distinguished by their long,
noncontractile tails. They are the least frequently isolated
cyanophages from both freshwater and sea water, and
the least characterized. The bulk of the cyanosiphoviruses
so far isolated infect unicellular hosts, although viruses
assigned to this family have been reported to infect the
marine filamentous cyanobacterium Lyngbya majuscula
(47). Again, as with the cyanomyoviruses, there is
considerable variation in mol%GC content, ranging from 46% for
S4-L (52) to 70-74% for S-1 (1). S-1 has a genome of 38 kb (1).

The podoviruses possess heads with icosahedral
symmetry and short tails. The first cyanophage to be
discovered, which caused lysis of several species of
cyanobacteria from the genera Lyngbya, Plectonema, and
Phormidium, hence the designation LPP-1, belongs to this
family (90). Mol%GC contents show rather less variation in
cyanopodoviruses, ranging from 53-55% for LPP-1 (64) to
66-67% for SM-1 (91). Genome sizes range from around
42 kb for LPP-1 (64) to 48 kb for the marine isolate P60
(22). Phages infecting the high-light clade of
Prochlorococcus marinus belong to this family (109).
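
The mol%GC figures quoted for the three families are simple
base-composition percentages: the number of G and C bases
divided by the total number of unambiguous bases, times 100.
A minimal sketch of that arithmetic is given below. The
function name and the toy fragment are illustrative
assumptions, not taken from any of the cited studies, and
the fragment is not a real cyanophage sequence.

```python
def mol_percent_gc(sequence: str) -> float:
    """Return the mol%GC of a DNA sequence.

    Only unambiguous bases (A, C, G, T) are counted; any other
    symbol (for example N) is ignored in both numerator and
    denominator.
    """
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    if total == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return 100.0 * gc / total


if __name__ == "__main__":
    # Hypothetical toy fragment, used only to exercise the function.
    fragment = "ATGCGCATTA"
    print(f"mol%GC = {mol_percent_gc(fragment):.1f}")
```

Applied to a complete genome sequence, the same calculation
would give a single value comparable to those listed above,
although published values obtained by older physical methods
(reported here as ranges such as 53-55%) need not agree
exactly with a sequence-derived figure.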

Infection of Host Cells 

Host Range 

There have been many problems associated with
establishing the host range of freshwater cyanophages
infecting both filamentous and unicellular cyanobacteria.
These problems arise largely, though not exclusively, from
the previously mentioned problems with cyanobacterial
taxonomy and are excellently discussed by Suttle (110). The
situation with marine cyanophages is not so problematic,
particularly with regard to phages infecting strains
of unicellular cyanobacteria belonging to the MC-A
Synechococcus and Prochlorococcus lineage. However, in spite
of the taxonomic problems there are two consistent
observations that can be made relating to cyanophage host
range, namely that phages that infect filamentous strains
do not infect unicellular strains and phages that infect
marine strains do not infect freshwater strains.

Host range studies are carried out under laboratory 
conditions and there are a number of factors that make 
the extrapolation to natural assemblages problematic. In 
particular, natural assemblages are usually composed of 
different ecotypes that may vary in their sensitivity to 
infection by a particular phage for a variety of reasons: 
The physiological state of the cell or its position in the 
cell cycle may determine the ability of the phage to 
adsorb (see “Adsorption” below). The incoming phage DNA
might be subject to restriction by endonucleases specified 
by coinfecting phages. There may, of course, be selection 
for mutation to resistance, though this is likely to incur 
a trade-off in fitness. Another reason for differential 
efficiency of infection might be immunity arising from 
lysogeny, and some lytic phages are also able to prevent 
secondary infection by other lytic or temperate phages by a 
process known as superinfection exclusion. 

Detailed studies have been carried out on the host 
range of phages infecting MC-A Synechococcus strains. 
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Waterbury and Valois (121) found considerable variation 
in the host ranges of their Synechococcus phage isolates. 
Some phages would infect as many as 10 of the 13 strains 
tested, whereas others would infect only the strain used 
for isolation. One phage isolated on a MC-A Synechococcus 
strain would infect other MC-A strains, but also a MC-B 
strain (WH8101). None of the phages would infect the 
freshwater strain Synechococcus sp. PCC 6307. Suttle 
and Chan (112) isolated phages on both MC-A and MC-B 
hosts. Host range was not correlated with the geo¬ 
graphical locations where the phages and hosts were 
isolated. Phages from all three phage families were isolated 
that infected MC-B strains, but all the phages capable of 
infecting MC-A strains were myoviruses. One of their 
isolates, a myovirus (S-PWM3), infected a green Synechococcus 
(presumably MC-B) as well as three MC-A strains. 
Again no infectivity against freshwater strains was 
observed. Thus, myoviruses appear to exhibit broader host 
ranges. 

Lu et al. (63) also isolated phages against 
phycoerythrin-containing MC-A and phycoerythrin-lacking (presumably 
MC-B) hosts. Again they found that phages infecting 
MC-A Synechococcus strains had a broader host range and 
that there were phages infecting hosts from both MC-A 
and MC-B strains. These observations have been extended 
by the discovery by Sullivan et al. that there are marine 
phages that can infect both Prochlorococcus marinus 
and Synechococcus (109). Furthermore, there was a clear 
distinction between phages which infected the strains of 
the high-light clade of Prochlorococcus marinus, which 
were highly strain-specific podoviruses, and those which 
infected strains of Synechococcus and the low-light 
clade of P. marinus, which were broader host range 
myoviruses. One caveat applying to all such studies is 
that they may be very sensitive to which host strain a 
phage was propagated on prior to assessment of host 
range, since the hosts may possess different 
restriction-modification systems. 

Dutta et al. (33) have revisited the concept of the 
“nascent phage” based on experiments with the T4-like 
phage LZ4. Drawing on earlier work, including the 
observation that intracellular phages (T2 and T4) are 
associated with the cytoplasmic membrane (100) and 
that newly released phage carry a membrane fragment, 
loss of which is correlated with cofactorless infection 
(19), they propose that nascent phages are newly synthesized 
phages which can infect related bacteria that 
lack the normal phage receptor. In this context, it was 
reported for cyanophage LPP-1 that newly formed phage 
particles were found to be closely associated with the 
thylakoids and to remain attached after lysis (105). If 
this membrane association can similarly lead to a nonspecific 
broadening of the host range such as that proposed 
for phage LZ4, then such nascent-phage status 
could seriously complicate assessment of cyanophage 
host range and ecological significance (see “Contact Rates” 
below). 

Adsorption 

The first key step in the interaction between a phage and 
a potential host cell is the adsorption of the phage to 
the cell envelope, a process which involves the phage 
adhesins recognizing and binding to specific cell-surface 
receptors, commonly either lipopolysaccharide or protein. 
Consequently, it is surprising that there has been so little 
work done on identifying the receptors recognized by 
cyanophage adhesins, or on characterizing the adhesins 
themselves. Discussion of what has been accomplished 
in the area of cyanophage adsorption follows. 

Adsorption of phage AS-1 was found to be significantly 
dependent on light, though this dependence could 
be reduced by increasing the concentration of Na+ ions 
(28). LPS has been implicated in the adsorption of the 
cyanomyovirus AS-1 by virtue of the ability of purified 
polysaccharide to inactivate AS-1. AS-1 adsorption protein(s) 
also have been preliminarily identified by the purification 
of a fraction with receptor activity (92). Disruption in 
Anabaena sp. strain PCC 7120 of the genes thought to 
encode undecaprenyl-phosphate galactosephosphotransferase 
(rfbP) and first mannosyl transferase (rfbZ) led to 
resistance to the obligately lytic phage A-1(L), as well as 
the temperate phage A-4(L) (129). Both these enzymes 
are involved in the synthesis of the O-antigen component 
of LPS and electrophoretic analysis showed that the 
interruption of the rfbP and rfbZ genes led to a change in or 
loss of the characteristic pattern-length of the LPS. The O 
antigen is composed of serially repeated, strain-specific 
oligosaccharide units. Thus, the currently available 
evidence implicates LPS in cyanophage adsorption and 
variation in the nature of the O-antigen component as a 
determinant in phage host range. 

The nature of the cell surface is likely to be influenced 
by the nutrient stresses imposed on the cell, particularly 
with respect to proteins involved in nutrient transport. 
It was shown as early as 1940 for a coliphage that the 
adsorption rate constant under optimal growth conditions 
was more than 60 times greater than under poor 
conditions (30). Consequently both the presence and density 
of some receptors might be influenced by nutrient availability, 
as is the case for phage λ where the receptor is 
induced by maltose. However, the only study on the 
effect of nutrient starvation on cyanophage adsorption 
showed that phosphate depletion had no effect on adsorption 
kinetics (126). In contrast it has been reported in the 
case of a coastal strain of Synechococcus that phage would 
only bind to about 10% of the cells, suggesting that only 
a small proportion of the population was expressing the 
particular phage receptor at any particular time (110). 
Conversely, phage were found to attach to all cells in 
a clonal Synechococcus population (46). It has even been 
suggested that there may be ecological advantages in an 
oligotrophic environment in expressing decoy phage receptors 
leading to phage adsorption and DNA entry, but resistance 
to subsequent viral replication and lysis, thus 
permitting the incoming DNA to be used as a valuable 
source of carbon, nitrogen, and phosphorus (35). 
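
The practical meaning of a 60-fold difference in the adsorption rate constant can be illustrated with the standard first-order adsorption model, in which free phage decline as P(t) = P0 exp(-kNt) at a constant host density N. The following is only a minimal sketch: the rate constant, host density, and incubation time are hypothetical values chosen for illustration, and only the 60-fold ratio is taken from the text.

```python
import math

def fraction_adsorbed(k, host_density, minutes):
    """Fraction of phage adsorbed after `minutes`, assuming first-order
    kinetics with rate constant k (ml/min) and a constant host density
    (cells/ml)."""
    return 1.0 - math.exp(-k * host_density * minutes)

# Hypothetical illustration values (not taken from the cited studies).
K_GOOD = 2.5e-9          # ml/min, rate constant under good growth conditions
K_POOR = K_GOOD / 60.0   # 60-fold lower constant under poor conditions
HOSTS = 1e8              # cells/ml
T = 10.0                 # minutes of contact

good = fraction_adsorbed(K_GOOD, HOSTS, T)
poor = fraction_adsorbed(K_POOR, HOSTS, T)
print(f"good conditions: {good:.3f}, poor conditions: {poor:.3f}")
```

Under these assumed values most phage adsorb within the contact period in well-nourished cells, whereas only a few percent do so in poorly growing cells, which is why receptor density and host physiology matter when extrapolating laboratory host-range data to natural assemblages.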

Restriction-Modification Systems and DNA Modifications 

Restriction endonucleases are thought to represent a 
mechanism by which bacterial cells can degrade incoming 
“foreign” DNA and thus resist phage infection even if 
the cell possesses the phage receptor. Certainly a large 
number of restriction-modification systems have been 
characterized in cyanobacteria and such an 
adsorption-then-restriction process would explain the difference in 
efficiency of plating of phage N-1 on Anabaena sp. PCC 7120 
and Anabaena variabilis (29). 

Restriction endonucleases encoded by phages can 
also represent mechanisms by which the phage can target 
the degradation of the genome of the infected cell and 
prevent superinfection by other phages lacking the appropriate 
modification. Phage AS-1 has been shown to encode 
a restriction-modification system that can degrade host 
DNA (113). Wilson et al. (128) also observed that EcoRI 
digested the DNA of four phages propagated on the MC-A 
strain Synechococcus sp. WH7803, but failed to digest the 
DNA of a fifth, the myovirus S-WHM1, suggesting that 
it encoded its own restriction-modification system. 

Another mechanism by which a phage can protect 
itself against both its own and the host’s restriction enzymes 
is to substitute, by a modification of the biosynthetic 
pathway, an unusual base (e.g., hydroxymethyl cytosine) 
throughout the genome in place of one of the normal 
bases. This latter approach is adopted by phage T4 and 
provides almost unlimited protection against nucleases 
that recognize unmodified sequences. Protection against 
nucleases recognizing modified sequences is conferred 
by glucosylation of the hydroxymethyl cytosine residues 
(21). AS-1 DNA was shown to contain about 5% of a modified 
nucleotide which was not 5-methyldeoxycytidylic 
acid (113) and lack of modification of the replicated phage 
DNA early in the infection cycle may explain the phenomenon 
of phage-induced light (PIL) DNA (48). In addition, 
2-aminoadenine has been found to substitute for adenine 
in the DNA of cyanophage S-2L (53, 54). Many restriction 
endonucleases do not digest MC-A Synechococcus phage 
DNA effectively, suggesting the presence of modified bases 
or the lack of restriction sites (63, 128). 

There may also be a strong selection pressure on 
phages to lose potential restriction sites. Such may be 
the case for cyanophages that infect species of Anabaena 
and Nostoc. An analysis of restriction endonuclease 
cleavage of DNA isolated from these phages has provided 
evidence for counter-selection of restriction endonuclease 
sites. These include sites containing subsequences 
that are methylated by host (Anabaena sp. PCC 7120) 
methylase(s) (10). 
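
Counter-selection of this kind is usually inferred by comparing the number of recognition sites observed in a genome with the number expected by chance. The sketch below shows that comparison in its simplest form. It assumes equal frequencies of the four bases, scans one strand only (adequate for palindromic sites such as the EcoRI site GAATTC), and uses a hypothetical genome length and a toy sequence; it is not the analysis used in the cited study.

```python
def expected_sites(genome_length, site_length):
    """Expected count of one fixed recognition site in random linear DNA
    with equal base frequencies (each window matches with prob. 4**-k)."""
    windows = max(genome_length - site_length + 1, 0)
    return windows / 4 ** site_length

def observed_sites(sequence, site):
    """Count occurrences of `site` in `sequence`, allowing overlaps."""
    sequence, site = sequence.upper(), site.upper()
    return sum(
        1
        for i in range(len(sequence) - len(site) + 1)
        if sequence[i:i + len(site)] == site
    )

# Hypothetical example: a 40 kb linear genome and a 6-bp site.
exp = expected_sites(40_000, 6)
print(f"expected 6-bp sites in 40 kb of random DNA: {exp:.1f}")

# Toy sequence, purely illustrative.
toy = "ATGAATTCGGAATTCTT"
obs = observed_sites(toy, "GAATTC")
print(f"observed GAATTC sites in toy sequence: {obs}")
```

A genome carrying far fewer sites than the random expectation, across several enzymes native to its hosts, is the pattern that would be consistent with selection against those sites. A real analysis would also need to correct for base composition.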

Lysogeny 

Although many cyanophages are obligately lytic, a significant 
proportion are capable of lysogeny and this topic is 
well established for freshwater cyanophages and was extensively 
reviewed by Sherman and Brown (97) (see chapter 7 
for a discussion of general aspects of lysogeny). It was 
hoped that temperate phages would be useful as tools for 
fine structure genetic mapping, but this hope was never 
realized. An important phenomenon associated with lysogeny 
is that prophage may significantly alter the phenotype 
of the host cell, a phenomenon known as phage 
conversion. As yet, however, there is no clear evidence 
for cyanophage “conversion” of host strains. The earliest 
reports on lysogeny focused on phages infecting filamentous 
strains and eventually Plectonema boryanum and 
phage LPP-2SPI became an accepted system to study (79). 
The SPI strain of P. boryanum can continually liberate 
phages, but induction of a prophage could not be achieved 
with ultraviolet (UV) light, X-rays or mitomycin C, and the 
cells though sensitive to infection by LPP-1 were immune 
to LPP-2. However, Rimon and Oppenheim (83) were able 
to isolate a temperature-sensitive mutant of the phage 
that could be induced at 40 °C in the light. Inhibitor 
studies (25) showed that transcription and translation were 
required for induction, as was light. 

Attempts to induce prophages in MC-A Synechococcus 
strains by a range of treatments including temperature 
shock, light shifts, UV and X-irradiation, and 
mitomycin C were not successful (121). No intact prophages 
have been found in the sequenced cyanobacterial 
genomes. However, the marine MC-A strain Synechococcus 
sp. WH8102 contains 16 putative phage integrases 
and three possible integrase regulators (80). Prochlorococcus 
strains MIT9313 and MED4 contain four and one, 
respectively (86). These phage integrase genes may represent 
the remnants of once fully functional prophages. 
Sode et al. (107) isolated a temperate phage infecting 
a marine Synechococcus strain, though not of the MC-A 
group, which was inducible by heavy metals, particularly 
copper (106). 

More recently attempts have been made to detect 
lysogeny in natural assemblages of MC-A Synechococcus. 
One study was carried out on a Synechococcus bloom in 
a pristine fjord in British Columbia, Canada (77). The 
samples were incubated with and without mitomycin C 
and the abundances of Synechococcus and infectious 
cyanophage (using strain WH7803 as a host) were estimated. 
Lysogenic phage production was estimated from 
the difference in cyanophage abundance between the 
mitomycin-C-treated samples and the controls. On this 
basis induction of prophages occurred in 0.6% of the 
Synechococcus population. This represents a minimum value as 
only phage that infect strain WH7803 would be counted 
and mitomycin C may not be an effective inducer of 
prophages in Synechococcus spp. In a second study, McDaniel 
et al. (70) analyzed lysogeny using mitomycin C during an 
annual cycle in Tampa Bay, Florida. The frequency of 
lysogens was inversely correlated with Synechococcus abundance. 
Lysogens were primarily detected during the late 
winter months, though lysogens were also detected in 
August, preceding a secondary autumnal Synechococcus 
bloom. 
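
The arithmetic behind such induction estimates can be sketched as follows. The excess phage in the mitomycin-C-treated sample is divided by a burst size to give the number of induced cells, which is then expressed as a fraction of the host population. The use of an assumed burst size, and all of the numbers below, are hypothetical and serve only to illustrate the calculation; they are not the data of the studies cited.

```python
def induced_fraction(phage_treated, phage_control, burst_size, host_abundance):
    """Fraction of host cells induced, estimated from the excess phage in
    mitomycin-C-treated samples divided by an assumed burst size."""
    if burst_size <= 0 or host_abundance <= 0:
        raise ValueError("burst_size and host_abundance must be positive")
    # Treat a treated count below the control as "no detectable induction".
    excess_phage = max(phage_treated - phage_control, 0.0)
    induced_cells = excess_phage / burst_size
    return induced_cells / host_abundance

# Hypothetical abundances (per ml), chosen only to illustrate the arithmetic.
frac = induced_fraction(
    phage_treated=1.3e5,
    phage_control=1.0e5,
    burst_size=50,        # assumed phage released per induced cell
    host_abundance=1.0e5,
)
print(f"estimated induced fraction: {frac:.2%}")
```

Because the estimate scales inversely with the assumed burst size, and counts only phage that plate on the indicator strain, values obtained this way are best read as lower bounds.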

Autoplaquing refers to the appearance of plaques 
within a bacterial lawn that has not been infected with 
phage, and is thought to be due to the spontaneous 
induction of a prophage. The phenomenon of autoplaque 
formation has been observed for about 50% of clonal 
Nodularia spumigena isolates from the Baltic Sea with 
either cyanomyoviruses or cyanosiphoviruses being present 
within the cell lysates; autoplaque formation was associated 
with senescent Nodularia cultures and cultures 
exposed to high light/temperature (C. Jenkins and P. K. 
Hayes, personal communication). Studies with the filamentous 
marine cyanobacterium Phormidium persicinum 
(Provasoli strain) have shown it to be a lysogen and that 
the prophage could be induced with mitomycin C or UV 
(76). This led to the suggestion that the rapid disappearance 
of Trichodesmium blooms and the lysis of laboratory 
cultures following exposure to stresses such as a sudden 
temperature increase (75) could be explained in the 
same way. 

Pseudolysogeny 

There is evidence that obligately lytic Synechococcus 
phages can enter the pseudolysogenic state (126), a 
phage-host relationship in which a phage-infected cell grows 
and divides even though its virus, though metabolically 
inactive, is pursuing a lytic infection. A commonly cited 
example is phage T3 infection of F+ Escherichia coli under 
starvation conditions (13). When the obligately lytic phage 
S-PM2 (a myovirus) was used to infect Synechococcus sp. 
WH7803 cells grown in phosphate-replete or phosphate-deplete 
media it was found that there was an apparent 80% 
reduction in the burst size under phosphate-deplete conditions. 
However, a more detailed analysis showed that 100% 
of the phosphate-replete cells lysed, compared with only 
9% of the phosphate-deplete cells, suggesting that the 
majority of phosphate-deplete cells were pseudolysogens. 
Similar observations were made with two other obligately 
lytic Synechococcus myoviruses, S-WHM1 and S-BM1. 

Temperate phages can alter the phenotype of their 
lysogenic host via the phenomenon of phage conversion 
(see “Lysogeny” above). In the pseudolysogenic state both 
lytic and temperate phages might be able to modify the 
phenotype of the host cell. There is one example in the 
literature (55). T7 is capable of forming a pseudolysogenic 
relationship with strains of Shigella dysenteriae. S. dysenteriae 
is normally not capable of utilizing mannitol (Mann−) 
or lactose (Lac−), but T7 pseudolysogens become Mann+ 
Lac+, which is attributed to the effect of the phage endolysin 
altering the permeability of the cell envelope. It may be 
appropriate to coin the term “pseudolysogenic conversion” 
to refer to the temporary alteration in the 
phenotype of a pseudolysogenic host. In this context it 
is worth noting that the marine cyanophage P60 carries 
the phoH gene, which in other systems is associated with 
the host’s response to phosphate starvation (22). 

Effects on Host Cell Physiology 

Effects on Photosynthesis and Respiration 

A key feature of the cyanobacteria is their oxygenic photosynthesis 
and consequently much interest has been 
focused on the impact of phage infection on the host’s 
photosynthetic metabolism and, conversely, the dependence of 
phage replication on photosynthesis. Obviously, photosynthesis 
is the primary source of energy for phage replication 
and assembly and for the fixed carbon for the 
biosynthesis of nucleotides. However, there are more 
subtle points at which the two processes are likely to interact. 
For example, phycobiliproteins represent a very substantial 
component of total cell protein, thus a continued 
synthesis of phycobiliproteins will represent a drain on 
resources for phage replication and existing phycobiliproteins 
represent a potential source of amino acids for 
the biosynthesis of phage proteins. However, phycobiliproteins 
are also components of the light-harvesting antenna 
for photosynthesis. 

Cyanobacteria in the natural environment also are 
subject to a light-dark cycle and it is important to know 
how phage replication will be affected by the switch from 
light to dark metabolism (see below). Dark metabolism 
in cyanobacteria relies on energy production by the 
oxidative pentose phosphate pathway in which 
glucose-6-phosphate dehydrogenase is a key enzyme. There are 
several reports of the activity and properties of 
glucose-6-phosphate dehydrogenase being altered following phage 
infection (6, 9, 102) and its normal susceptibility to redox 
control has been reported to become uncoupled during 
phage infection of Synechococcus sp. PCC 6301 (27). 

The photosynthetic apparatus of cyanobacteria, like the 
chloroplasts of higher plants, employs two photosystems: 
PSI and PSII (11, 41). Light energy harvested by PSII is 
used to drive the photolysis of water and an excited 
electron is transferred from chlorophyll P680 to pheophytin, 
which reduces a bound plastoquinone molecule, QA. This 
in turn reduces a second quinone, QB. From here electrons 
flow through the intersystem chain to PSI, concomitantly 
generating ATP. In PSI a second photon-induced 
photochemical charge separation occurs at chlorophyll 
P700 and the electron is passed through a series of electron 
carriers to ferredoxin and thence to NADP+. According 
to the energy needs of the cell, electrons can also flow 
from ferredoxin back into the intersystem electron transport 
pathway to enhance ATP production, a process referred 
to as cyclic photophosphorylation. 

An important feature of the photosynthetic process 
that would affect phage replication is the fact that the 
water-splitting photochemistry of PSII produces various 
radicals and toxic oxygen species that cause damage to 
PSII. At the core of PSII lies a heterodimer of two related 
proteins, D1 and D2, which binds the pigments and cofactors 
necessary for primary photochemistry. During active 
photosynthesis D1, and to a lesser extent D2, turns over 
rapidly as a result of photodamage and is replaced by 
newly synthesized polypeptides in a repair cycle. When the 
rate of photoinactivation and damage of D1 exceeds the 
capacity for repair, photoinhibition occurs, resulting in 
a decrease in the maximum efficiency of PSII photochemistry 
(8). 

There are two aspects of photoinhibition that are relevant 
to a consideration of cyanophages. Firstly, the production 
of radicals and toxic oxygen species may directly 
damage components of the nascent phage. Secondly, a functional 
PSII may be required to provide the energy for 
the process of phage replication. It should be remembered 
that cyanobacteria in the natural environment are likely 
to be nutrient stressed and this will significantly enhance 
the likelihood of photoinhibition. Phage T4 possesses a 
variety of mechanisms that ensure host gene expression is 
shut down shortly after infection. Were cyanophages to 
adopt a similar strategy, then the PSII repair cycle would 
cease with the failure to synthesize new undamaged 
D1. The central importance of this particular topic is 
made clear by the observation that genomic studies on 
a marine cyanomyovirus S-PM2 indicate that it encodes 
the D1 and D2 proteins of PSII (67), which suggests that 
the phage may ensure a continued repair cycle for PSII 
(see below). The occurrence of genes encoding D1 and 
D2 proteins is widespread amongst independent marine 
cyanomyoviruses isolated from geographically distinct 
provinces (A. Millard, personal communication). This is 
not a universal strategy, however, since the marine 
cyanopodovirus P60, which has a much smaller genome that 
has been fully sequenced, does not encode D1 or D2. 

Many of the early studies on the relationship 
between phage infection and photosynthesis relied heavily 
on the use of inhibitors such as DCMU and CCCP, whose 
mode of action was not clearly established at the time. 
CCCP is a protonophoric uncoupler, which blocks both 
photosynthetic and respiratory ATP production, but is 
also now known to inhibit intersystem photosynthetic 
electron transport by causing futile cycling of electrons 
around PSII (94). DCMU inhibits photosynthetic electron 
transport at the acceptor side of PSII by binding to 
the D1 protein of PSII, competing with quinone for binding 
to the QB site and thereby inhibiting electron transport 
from the primary quinone acceptor, QA, to the secondary 
quinone acceptor, QB. However, this inhibition of 
photosynthetic electron transport will have a number of 
pleiotropic effects which make it difficult to establish 
whether any effects of DCMU are due to its primary 
effect on electron transport or its secondary effects on 
other processes such as redox signaling. 

In this context it is important to mention that DCMU 
can affect the transcription of the multiple psbA genes 
(encoding D1 isoforms) in cyanobacteria (95, 103) and 
may have a more general effect on the translation machinery 
(95). DCMU, somewhat counterintuitively, inhibits 
D1 turnover in cyanobacteria (20, 42), but nevertheless 
has a protective effect on PSII, reducing photoinhibition, 
probably by altering the conformation of the QB site of 
D1 such that acceptor-side photoinhibition does not 
occur (74). It is in the light of these concerns about 
D1 turnover and the primary and secondary effects of 
DCMU that early experiments on the relationship 
between phage infection and photosynthesis should be 
considered. 

In the case of phages infecting unicellular cyanobacteria, 
the general trend is for there to be a variable but 
usually substantial dependence of the infection cycle on 
the continued photosynthetic activity of the infected 
host. An almost complete dependence of phage replication 
on photosynthesis was observed with phage AS-1 (a 
lytic myovirus), which infects the unicellular cyanobacterium 
Synechococcus sp. strain PCC 6301 (formerly 
Anacystis nidulans) (4). Darkness, post-infection, did not 
completely abolish phage replication, though the yield 
was reduced to 2% of that obtained with light-incubated 
cells. DCMU reduced the phage yield by 73%, for which 
the simplest explanation is that cyclic photophosphorylation 
via PSI could support a reduced level of phage 
production. The uncoupler CCCP, not surprisingly, completely 
abolished phage production. 

Cyanophage AS-1 was found to progressively inhibit 
oxygen evolution in infected cells, with a rate of less 
than 10% that of uninfected cells at the time that lysis 
was commencing. This was coupled with a progressive 
reduction in PSII-mediated photosynthetic electron flow, 
whereas the activity of PSI was unaffected by the infection 
(114). The target of PSII inhibition is probably at the level 
of the secondary acceptor, QB (114). However, perhaps 
the most significant observation was that the turnover 
of the D1 protein was inhibited in infected cells 
(114). This suggests that the progressive impairment of 
photosynthesis as reflected by oxygen evolution was due to 
accumulating photodamage and consequent photoinhibition 
of PSII. Thus, the infected cells presumably lacked the 
normal D1 repair cycle of uninfected cells and AS-1 was not 
providing a phage-encoded repair cycle. 

A very different phage-host relationship was 
observed following infection of Microcystis aeruginosa 
strain NRC-1 with the cyanopodovirus SM-1 (65). Photoassimilation 
of carbon dioxide was not significantly inhibited 
prior to lysis of the cells, suggesting that the cells 
were not sensitive to photoinhibition and must have 
a functioning PSII repair cycle. Furthermore, withdrawal 
of carbon dioxide, absence of light or presence of 
DCMU (10⁻⁶ M) completely abolished phage replication and 
even led to the loss of infectious centers. DCMU, which 
permitted a degree of phage replication in the case of AS-1, 
may have exerted such a strong effect with SM-1 because 
the diversion of electrons into cyclic photophosphorylation 
would have prevented the production of NADPH 
required for carbon dioxide fixation and phage replication. 
Thus, replication of SM-1 was completely dependent on 
a fully functional photosynthetic apparatus and could not 
be even partially sustained by cyclic photophosphorylation 
or dark respiration. It should be noted that there is considerable 
taxonomic confusion surrounding strain NRC-1, 
which is well discussed by Suttle (110); it likely corresponds 
to Synechococcus sp. PCC 6911. 

The third cyanophage infecting unicellular cyanobacteria 
for which such studies have been carried out is 
AS-1M (a lytic myovirus) (96). When Synechococcus cedrorum 
(= Synechococcus sp. strain PCC 6908) was infected with 
cyanophage AS-1M, carbon dioxide fixation and oxygen 
evolution continued at high levels throughout the latent 
period, declining only immediately prior to lysis, as was 
the case with SM-1. The similarity to SM-1 infection 
extended to the complete lack of phage replication in 
the presence of DCMU, or the absence of carbon dioxide 
or light. However, the presence of exogenous glucose in 
the presence of DCMU, or in the dark, restored phage 
yields to about 10% of the light-incubated controls, which 
suggests a respiratory production of NADPH given light 
plus DCMU and a respiratory production of ATP given 
dark plus glucose. 

Studies with filamentous cyanobacteria, as opposed 
to unicellular strains, have led to the observation of somewhat 
different relationships between phage infection 
and photosynthesis. Cyanophage N-1 (a myovirus) infection 
of Nostoc muscorum showed that phage replication and 
release was largely dependent on photosynthetic metabolism 
throughout infection rather than there being a particular 
critical time during which light was required for 
the phage replication cycle (3). Carbon dioxide fixation 
began to decline some halfway through the infection cycle, 
however, and was absent immediately prior to lysis. 
Use of the inhibitor DCMU at a concentration (10⁻⁶ M) 
which completely inhibited carbon dioxide fixation led to 
burst sizes that were 25% of control values. This result 
suggests that cyclic photophosphorylation alone can 
sustain the replicative cycle. Cyanophage N-1-infected cells 
have also been shown to have a reduced 
ferredoxin:NADP+ oxidoreductase activity, which may be associated 
with cyclic photophosphorylation (51). Cells with DCMU 
(10⁻⁵ M) in the dark still yielded about 2% of phage production 
compared with uninhibited cells in the light, thus 
indicating that oxidative phosphorylation could still 
support phage replication albeit at very reduced levels. 
This dark production of phage was completely abolished 
by the uncoupler CCCP. 

In keeping with these observations, photosynthetic 
oxygen evolution was found to begin to decline shortly 
after infection, which may be a reflection of an impaired 
turnover of the PSII D1 protein leading to photoinhibition. 
In contrast, respiratory oxygen consumption was 
found to be unaffected or even increased and there was 
a transient increase in the activity of the key respiratory 
enzyme, glucose-6-phosphate dehydrogenase (6, 102), 
that was accompanied by a decline in glycogen stores 
(102). Phycocyanin was also progressively degraded during 
the course of N-1 infection (6, 51). This loss of phycocyanin 
could be interpreted as a viral strategy to reduce 
photon pressure on an already inhibited PSII, thus minimizing 
further inhibition and also reducing the potential 
for production of toxic oxygen species. Another, not necessarily 
exclusive possibility is the use of amino acids, produced 
by such degradation, for phage protein synthesis (6, 51). 
A markedly different effect on carbon dioxide fixation 
was observed following infection of Plectonema boryanum 
with phage GUI, which led to a rapid and complete cessation 
of fixation (39). 

A complicated pattern emerges when comparing the 
same phage infecting different hosts, or different phages 
infecting the same host. In the case of LPP-1 infecting 
the host Plectonema boryanum, Sherman and Haselkorn (99) 
showed that even in the dark a burst size of 15% of 
that observed with light-incubated controls could be 
obtained. The PSII inhibitor DCMU reduced the burst size 
by 60-70% indicating that, as in the case of N-1, cyclic 
photophosphorylation alone could support the replicative 
cycle, albeit at a reduced level. Very different results were 
observed when the same phage, LPP-1, was used to infect 
a different host, Phormidium uncinatum (14). The phage 
yield was unaffected by treatment of infected cells with 
DCMU or by transfer to the dark. PSII activity began 
to decline, presumably due to photoinhibition, 9 hours 
post-infection. 
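
The yields quoted in this section for different phage-host pairs are easier to compare when expressed on a common scale, as a percentage of the light-incubated control. The sketch below tabulates the figures given in the text, converting "reduced by X%" into the percentage remaining. Two simplifications are assumptions made for illustration: burst size and phage yield are treated as interchangeable, and "unaffected" is recorded as 100%.

```python
def remaining_yield(percent_reduction):
    """Convert 'yield reduced by X%' into 'X% of control remaining'."""
    return 100.0 - percent_reduction

# Phage yield as % of the light-incubated control, as quoted in the text.
# Each entry is a (low, high) range; single values repeat the same number.
yields = {
    ("AS-1", "Synechococcus sp. PCC 6301"): {
        "dark": (2.0, 2.0),
        "DCMU": (remaining_yield(73), remaining_yield(73)),
    },
    ("N-1", "Nostoc muscorum"): {
        "DCMU": (25.0, 25.0),
        "dark + DCMU": (2.0, 2.0),
    },
    ("LPP-1", "Plectonema boryanum"): {
        "dark": (15.0, 15.0),
        "DCMU": (remaining_yield(70), remaining_yield(60)),
    },
    ("LPP-1", "Phormidium uncinatum"): {
        "dark": (100.0, 100.0),   # "unaffected", recorded here as 100%
        "DCMU": (100.0, 100.0),
    },
}

for (phage, host), treatments in yields.items():
    for treatment, (low, high) in treatments.items():
        span = f"{low:.0f}" if low == high else f"{low:.0f}-{high:.0f}"
        print(f"{phage:6s} {host:28s} {treatment:12s} {span}% of control")
```

Laid out this way, the contrast between LPP-1 on its two hosts is immediate: the same phage retains 15% of its yield in the dark on Plectonema boryanum but is recorded as unaffected on Phormidium uncinatum.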

In contrast to the interaction of LPP-1 with Plectonema 
boryanum, infection by phage LPP-1(G) was independent 
of photosynthesis (38). Cyanophage LPP-1(G) was 
produced at the same yield in heterotrophic conditions 
(dark, glucose) as in photoautotrophic conditions, though 
aerobiosis was required for cyanophage replication in 
the dark. Exogenous glucose was not required for the 
cyanophage replication in the dark in heterotrophically 
grown cells. In photoautotrophically grown cells, the 
maximum burst size in the dark and with glucose was 
delayed for a period corresponding to glucose-uptake 
induction. Of the photosynthesis parameters tested, only 
carbon dioxide photoassimilation was affected during 
cyanophage LPP-1(G) infection under photoautotrophic 
conditions. 

As a postscript it is worth noting the interesting 
link between phage infection and thylakoid biogenesis. 
In E. coli, expression of the protein PspA (phage shock 
protein) is strongly induced by phage infection (34) and 
a homolog of PspA, VIPP1, is essential for thylakoid development 
in Arabidopsis thaliana (57). In two freshwater 
cyanobacteria, Synechocystis and Anabaena, both a VIPP1 
and a pspA gene are present, and the VIPP1 gene is essential 
to thylakoid development as in A. thaliana (123). The VIPP1 
gene is thought to have originated from a gene duplication 
of pspA and thereafter acquired its new function. 
This function involves a C-terminal extension, which discriminates 
VIPP1 proteins from PspA and is important for 
thylakoid formation (123). 

Other Aspects of Metabolism 

Among the other areas of host physiology that are 
affected by phage infection is nutrient transport. Blaska 
et al. detected minimal uptake of glucose and 
glucose-6-phosphate by uninfected cells of Anacystis nidulans, but 
uptake increased markedly after AS-1 infection, peaking 
6 hours post-infection (16, as cited by 69). The host cell's 
response to energy limitation may also be affected. A. nidulans 
normally accumulates guanosine 
3'-diphosphate-5'-diphosphate (ppGpp) upon shift from light to dark or 
upon treatment with uncouplers. This response, however, is 
abolished following AS-1 infection, indicating that the 
cell’s normal control over its response to energy starvation 
has been lost (17). 

Aspects of nitrogen and nucleic acid metabolism may 
also be influenced by cyanophage infection. Obviously 
there is a demand for nucleotides for phage replication 
and amino acids for phage protein biosynthesis. Rimon 
and Oppenheim (84) have implicated phage genes in the 
shutoff of host protein synthesis during LPP-2SPI infection. 
The need for nucleotides can, in part, be met by the degra¬ 
dation of host nucleic acids and in the case of AS-l-infected 
A. nidulans there was a 15- to 20-fold increase in DNase 
and RNase activities at different times post-infection (115). 
However, in AS-l-infected cells there was found to be a 
4-fold increase in DNA content compared with uninfected 
cells, implying that breakdown products are inadequate 
to meet the total biosynthetic requirements. The marine 
cyanophage P60 has been shown to encode several enzymes 


involved in nucleotide biosynthesis (22), suggesting that 
it enhances or modifies the infected cell's biosynthesis 
of nucleotides. Macromolecular processes may also be 
affected and AS-1 infection has been associated with 
an inhibition of post-maturational cleavage of 23S rRNA (18).

The nitrogen economy of the infected host may be 
altered and, in the case of LPP-1 infection of Phormidium 
uncinatum, a steep rise in the activity of nitrate reduc¬ 
tase can be detected both in the light and in the dark (15). 
It has been reported that sequences homologous to the 
Klebsiella pneumoniae nifH, D, and K genes were detected
by dot-blot hybridization in the genome of temperate 
phages of the NP-1T series that infect Nostoc sp. 39 (73). 
Furthermore, lysogens of Nostoc sp. 39 carrying these 
temperate prophages exhibited considerably enhanced 
nitrogenase activity. These surprising observations are 
supported by the sequencing of the genome of phage AN15, 
which has revealed the presence of a nif gene (A. Baker, 
W. H. Wilson and D. G. Adams, personal communica¬ 
tion), though the physiological significance of these phage- 
encoded nif genes has yet to be established. 

Ecological Significance 

Contact Rates 

It is axiomatic that the frequency with which phages 
encounter and infect susceptible hosts in the natural 
environment will determine whether phages exert a signi¬ 
ficant selection pressure on the hosts, in terms of both 
overall abundance and genetic diversity (for additional 
discussion on this quantitative and qualitative impact of 
phages on bacteria, see chapters 5 and 33). By far the most 
work on estimating cyanophage-host contact rates has 
been done with cyanophages infecting MC-A Synecho- 
coccus strains. The assumptions underlying the different 
approaches and the sometimes contrasting predictions 
have been discussed by Mann (66). 

Waterbury and Valois (121) used a theoretical 
approach based on the diffusivity of phage particles and 
physical approaches to coagulation, coupled with observa¬ 
tions on phage and host population densities, to estimate 
contact rates between phage and marine Synechococcus 
hosts in Woods Hole Harbor. On this basis it was calcu¬ 
lated that between 0.005% (at the end of the spring bloom) 
and 3.2% (during a Synechococcus peak in July) of the 
Synechococcus population was contacted, and assumed to 
be infected, on a daily basis. Contact rates have also been
calculated using a model for the interaction of viruses 
with hosts based on diffusive transport that allows inclu¬ 
sion of the motion of water and host. When coupled to 
estimation of phage decay rates this approach has led to 
somewhat higher estimates of contact rates. If a burst size 
of 50 was assumed, as many as 33% of the Synechococcus 
population would have to have been lysed daily at one of
the sampling stations (111). A subsequent study using the 
same approach (37) yielded figures for the proportion of 
the Synechococcus community infected ranging from 1-8% 
for offshore waters. In nearshore waters only 0.01-0.02% 
of the Synechococcus were lysed on a daily basis. In all 
cases the efficiency of infection was very low, with only 
1.01-3.18% of contacts leading to infection. 

The observations for nearshore waters are in good 
agreement with those of Waterbury and Valois (121). 
However, considerations of contact rates between phage 
and host, phage decay rates, and phage production become 
infinitely more complicated following the discovery that 
there are phages which can infect both MC-A Synechococcus 
and Prochlorococcus marinus (109), except in regions of 
the oceans where the temperature falls below the threshold 
for the occurrence of P. marinus. The complexity may be 
further enhanced if nascent cyanophages do have a greater 
host range than mature phage particles (discussed above). 

Another complication in the calculation of contact 
rates in natural assemblages arises from the accurate esti¬ 
mation of the true number of infectious phages. Assays 
with a single host are likely to lead to an underestima¬ 
tion of infectious phage. Another significant factor is the 
impact of solar radiation on the infectivity of natural 
cyanophage assemblages. Decay rates as high as 0.75 per 
day have been measured for phages infecting MC-A Synecho¬ 
coccus in the surface mixed layer of the Gulf of Mexico 
(37). The UV-B component of sunlight leads to phage 
inactivation primarily by the formation of pyrimidine 
dimers and much of this damage may be repaired by photo¬ 
reactivation involving a post-infection host cell repair 
mechanism. This is a well-established phenomenon in 
cyanophage-host systems (5, 7, 49, 59, 101) and such 
processes also occur in natural communities (122). The 
enhanced ability of a host to reactivate damaged phage 
if the host cell itself has been irradiated (Weigle reactiva¬ 
tion) is also likely to be important in natural assemblages 
and has also been reported to occur in cyanobacteria (59). 
The use of unirradiated hosts to assess cyanophage 
abundance in natural environments, particularly those 
with high insolation rates, consequently may lead to a signi¬ 
ficant underestimate of cyanophage abundance. 

In order to assess the impact of phage on host dynamics 
and diversity it is important to establish at what point 
the contact rate becomes a significant selection pressure 
on a population, leading to either the succession of intrin¬ 
sically resistant strains or the appearance of resistant 
mutants. This threshold, which would presumably repre¬ 
sent the point at which the phages begin to exert a signifi¬ 
cant selection pressure on the host, is the point at which 
the product of contact rate (assumed to be equal to the 
infection rate) and the burst size equals the product of the
virus decay rate and the virus population density (66).
Taking a range of experimentally determined values, the 


threshold would occur at between 10^2 and 10^4 cells ml^-1.
This is in agreement with data from natural Synechococcus 
populations that suggest a genetically homogeneous popu¬ 
lation would start to experience significant selection 
pressure when it reached a density of between 10^3 (111)
and 10^4 cells ml^-1 (87, as cited by 110). The use of fluorescently
labeled phage probes has also shown that phages
present at comparatively low abundance can control 
microbial community structure (46). Such ideas cannot be 
easily extended to filamentous or colonial cyanobacteria 
where phage release could occur within the filament or 
colony. 
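
The threshold condition described above can be illustrated with a short calculation. The sketch below is only an illustration and assumes simple mass-action kinetics, in which infections occur at a rate k x H x P (k is a hypothetical adsorption constant, H the host density, and P the phage density). Phage production then balances phage decay when k x H x b = d, giving a threshold host density H* = d / (k x b). The function name and the value of k are assumptions made for this example; the decay rate (0.75 per day) and burst size (50) are figures quoted earlier in this section.

```python
def threshold_host_density(decay_per_day, burst_size, adsorption_ml_per_day):
    """Host density (cells/ml) at which phage production balances phage decay.

    Assumes mass-action kinetics: production = k * H * P * b, loss = d * P,
    so the balance point is H* = d / (k * b).
    """
    if burst_size <= 0 or adsorption_ml_per_day <= 0:
        raise ValueError("burst size and adsorption constant must be positive")
    return decay_per_day / (adsorption_ml_per_day * burst_size)


# Illustrative inputs only: k is an assumed adsorption constant
# (1e-9 ml/min converted to ml/day); d and b are values quoted in the text.
k = 1e-9 * 60 * 24
h_star = threshold_host_density(decay_per_day=0.75, burst_size=50,
                                adsorption_ml_per_day=k)
print(f"threshold host density ~ {h_star:.2e} cells/ml")
```

With these assumed inputs the threshold falls near 10^4 cells/ml; a larger adsorption constant or burst size lowers it, and a faster decay rate raises it.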

Impact on Natural Assemblages 

Much of the original interest in cyanophages arose 
from their potential as agents to control nuisance blooms 
of freshwater cyanobacteria. Cyanobacterial blooms can 
cause economic and social impacts by a variety of mecha¬ 
nisms including toxin production, impairment of water 
treatment processes, and eutrophication. Attempts to use 
cyanophages to prevent or limit blooms of cyanobacte¬ 
ria have been reviewed by Martin and Benson (69) (see also 
chapter 48, which reviews the related phenomenon 
known as phage therapy). Some control of Plectonema bory- 
anum populations in outdoor pond facilities was achieved 
with phage LPP-1 (31, as cited by 69), though the phage 
was most effective when added prior to the formation of 
the bloom. However, the continued occurrence of nuisance 
blooms and the lack of use of cyanophages to control them 
is a testament to the lack of success with this approach. 

There are, however, continuing reports of the termi¬ 
nation of freshwater and marine blooms being associated 
with phages. Cyanophages were suggested
to be a factor in the decline of Aphanizomenon
flos-aquae blooms in shallow, eutrophic lakes in Manitoba
(26). van Hannen (117) described the growth of a cyano¬ 
bacterial bloom, dominated by Oscillatoria limnetica and 
Prochlorothrix hollandica, in two laboratory-scale enclo¬ 
sures of water from the shallow, eutrophic Lake Loosdrecht 
(the Netherlands), and attributed the population collapse 
of the filamentous O. limnetica to viral lysis. In a follow-up
to these observations, Gons et al. (44) described how in
similar laboratory-scale enclosures of water from the 
same lake, the predominating filamentous cyanobacteria 
grew vigorously for 2 weeks, but then their populations 
simultaneously collapsed, whereas coccoid cyanobacte¬ 
ria and eukaryotic algae persisted. The collapse coincided 
with a short peak in the counts of virus-like particles 
and transmission electron microscopy showed myoviruses, 
with isometric heads of about 90 nm outer diameter 
and tails more than 100 nm long, that occurred free, 
attached to and emerging from cyanobacterial cells. 
Expansive blooms of the toxic cyanobacterium Lyngbya 
majuscula were observed in two shallow-water regions of 
Moreton Bay, Australia, that were subject to rapid bloom
declines (8 to <1 km^2 in <7 days) (47). Virus-like particles
produced by decaying L. majuscula were observed using elec¬ 
tron microscopy and appeared to be siphoviruses. The induc¬ 
tion of temperate prophages has been proposed as the 
underlying mechanism of bloom termination of the ecologi¬ 
cally important marine diazotrophic cyanobacterium, 
Trichodesmium sp. (75). 

The most direct method of detecting cyanobacterium- 
phage interactions is to use transmission electron micro¬ 
scopy to determine the proportion of cells that contain 
visible mature phages. An assessment of the contribution 
of phage to host mortality can then be made based on esti¬ 
mates of the proportion of the infection cycle during 
which phage particles are visible and the assumption 
that mortality due to infection is twice the number of 
infected cells. This approach was used with marine Syne- 
chococcus by Proctor and Fuhrman (82). They found 
that, depending on the sampling station, between 0.8% and 
2.8% of cyanobacterial cells contained mature phage and, 
assuming that phage particles were only visible for 10% 
of the infection cycle, it was estimated that the percentage 
of infected cells was actually 10-fold greater than the 
observed frequency. Thus, viral infection could account 
for as much as 56% of marine Synechococcus morta¬ 
lity. However, many of the underlying assumptions in this 
approach have been questioned (66, 110).
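
The arithmetic behind this estimate can be sketched as follows. The function and parameter names are illustrative; the two assumptions encoded as defaults (particles visible for 10% of the infection cycle, mortality equal to twice the infected fraction) are those stated in the paragraph above, and they are the very assumptions that have been questioned.

```python
def mortality_from_visible_infection(visible_fraction,
                                     visible_share_of_cycle=0.10,
                                     mortality_factor=2.0):
    """Estimate the fraction of host mortality attributable to phage.

    visible_fraction: fraction of cells seen to contain mature phage (TEM).
    visible_share_of_cycle: share of the infection cycle during which
        particles are visible (10% in the calculation described above).
    mortality_factor: mortality taken as twice the infected fraction.
    """
    infected_fraction = visible_fraction / visible_share_of_cycle
    return mortality_factor * infected_fraction


for observed in (0.008, 0.028):
    estimate = mortality_from_visible_infection(observed)
    print(f"{observed:.1%} visibly infected -> {estimate:.0%} of mortality")
```

The upper observation (2.8% of cells visibly infected) reproduces the 56% figure. The result scales linearly with both assumed factors, so the estimate is only as reliable as those two numbers.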

Much attention these days is focused on the questions 
of whether phage affect the genetic diversity of their 
hosts and whether this in turn will affect phage diversity. 
In the context of this idea, different strains of the same 
species that are sensitive to infection by different phages 
and that display different growth rates in a particular set 
of environmental conditions could be treated as different 
species in terms of competition and succession. Support for 
the idea of a correlation between host and phage gene¬ 
tic diversity comes from studies on phages infecting 
marine MC-A Synechococcus strains. Maximum Synecho¬ 
coccus myovirus diversity in a stratified water column was 
correlated with maximum Synechococcus population den¬ 
sity (127) and changes in phage clonal diversity were 
observed from the surface water down to the deep chloro¬ 
phyll maximum in the open ocean (130). Distinct cyano- 
myovirus population structures were found in estuarine 
water versus open ocean (130). Temporal changes in the 
relative abundance of specific cyanophage g20 genotypes
were observed during the summer months in
Rhode Island coastal water (68). All this evidence sug¬ 
gests the presence of different host ecotypes in each envi¬ 
ronment and a dynamic interaction between cyanophage 
and host. 

In a study of an oligotrophic environment (Gulf of 
Aqaba, Red Sea) over an annual cycle, the Synechococcus 
diversity was monitored using the rpoC1 gene (M. Mühling,
N. Fuller, A. Millard, D. J. Scanlan, A. Post, W. H. Wilson, 


D. Marie, and N. H. Mann, unpublished results). There 
was considerable diversity in spring, with as many as 
12 Synechococcus genotypes present, and this was 
followed by a marked decline in diversity toward the 
summer and autumn, when only one or two genotypes 
dominated, respectively. In the following winter months 
there was an even greater Synechococcus diversity than 
there was in spring, with as many as 24 genotypes 
present. The genetic diversity in the co-occurring cyanophage
population was monitored using the cyanomyovirus
g20 gene. The seasonal changes in cyanomyovirus
diversity paralleled those of Synechococcus, with the greatest
diversity in spring (28 genotypes) and winter (22 genotypes). 
However, cyanophage diversity was not reduced as 
much as Synechococcus diversity during the summer (17 
genotypes) and autumn (15 genotypes). Given the short 
half-life of phages in the surface layers, the presence of 
cyanophage of multiple genotypes in the water column at 
times when the host population was dominated by only one 
or two genotypes indicates that cyanophage of more 
than one genotype were capable of infecting the domi¬ 
nant Synechococcus strains. However, the most abundant 
and second most abundant cyanophages (based on g20 
clones) showed, respectively, a parallel pattern of abun¬ 
dance to that of the most abundant and second most abun¬ 
dant Synechococcus clones, which dominated the summer 
and autumn maxima, suggesting a specific cyanophage- 
host relationship. 

Horizontal Gene Transfer 

In addition to the original interest in cyanophage as 
agents of biological control there was the hope for their 
potential as genetic tools, in the same way as phages
such as λ and P1 have contributed to the elucidation of

E. coli genetics. Unfortunately, as with biological control, 
cyanophages as genetic tools have not fulfilled early 
hopes. However, there is growing evidence, almost entirely 
from studies with marine cyanophage host systems, that 
cyanophages are important vectors of a variety of forms 
of horizontal gene transfer in natural assemblages. Hori¬ 
zontal gene transfer in general is a major factor in microbial 
evolution (56) and the phage-mediated transduction of 
both chromosomal and plasmid DNA has been reported in 
a variety of aquatic environments (71). Phages can potentially
act not only as vectors to transfer DNA between hosts
via generalized and specialized transduction, but can also
recombine with other phages during mixed
infections. (See chapter 33 for broader discussion of hori¬ 
zontal transfer between bacteria found within marine envi¬ 
ronments.) Phage might also acquire genes from their hosts, 
which could either alter their properties during a lytic infec¬ 
tion or lead to phage conversion during lysogen formation 
(see chapters 27 and 47 for further discussion of lysogenic 
conversion). 
Some of the clearest evidence for horizontal gene transfer
has come from recent studies on the genomes of
marine Synechococcus and Prochlorococcus strains (32, 80, 
86). For the marine phages to contribute to such transfer 
they must be able either to package host DNA into the 
phage capsid or incorporate host genes into the phage 
genome. Evidence has been obtained, in the case of the 
cyanomyovirus S-PM2, that approximately 1 in 10^5 phage
particles contain a host marker gene in their capsids 
(24). The genome of Prochlorococcus SS120 lacks the gene 
for deoxyribopyrimidine photolyase, which is present in 
other cyanobacteria (32). Instead, it encodes a pyrimidine 
dimer-specific glycosylase, which is absent in other cyano¬ 
bacteria and is speculated to have been acquired from 
a phage genome. A similar reasoning applies to the 
gene encoding a type I restriction-modification system. 

Although the genome of Synechococcus sp. WH8102 
does not currently contain any prophages, there is evidence
that it has been extensively altered in the past via
horizontal gene transfer, probably involving phages (80). 
The genome contains 16 probable or possible phage inte- 
grase genes and three putative integrase regulators that 
may represent the fossil remnants of prophages in the 
ancestral genomes. There are also putative integrases in 
genomes of the Prochlorococcus strains MIT9313 and 
MED4 (86). Many of the multiple putative phage inte¬ 
grases in Synechococcus sp. WH8102 occur in regions of 
the genome with an anomalously low mol%GC content 
and an atypical trinucleotide composition suggestive of 
genomic islands. Several of the genes found in these 
potential genomic islands appear to be involved in the 
carbohydrate modification of the cell envelope, for example 
glycosyltransferases and enzymes involved in the syn¬ 
thesis of sialic acid (80). The nature of the cell surface is 
a critical determinant in phage attachment and susceptibil¬ 
ity to grazing and thus is a target for intense selection 
pressure. One possibility is that Synechococcus sp. WH8102 
may use these enzymes to modify the cell envelope so as 
to evade grazers or phages. Support for this idea comes 
from the observations that several of the regions of the 
genomes of marine Synechococcus and Prochlorococcus 
strains, which are suggested to have been acquired by hori¬ 
zontal gene transfer or lost by deletion events, encode genes 
involved in lipopolysaccharide and/or surface polysac¬ 
charide biosynthesis (86). The discovery of cyanomyoviruses 
that can infect strains of Synechococcus and P. marinus 
extends the potential for horizontal gene transfer between 
these genera. 

Horizontal gene transfer may also involve recombina¬ 
tion between phages or the acquisition of host genes 
by phages. There is clear evidence from the sequencing
of the marine cyanomyovirus S-PM2 that its genome
encodes a number of genes whose closest homologs are in
cyanobacterial genomes (this laboratory, unpublished 
results). Furthermore, S-PM2 encodes the psbA and psbD 


genes of PSII, which must have been acquired from a host 
Synechococcus (67). 

Conclusions 

All the cyanophages so far isolated fall into just three of 
the recognized families of phages: the tailed phages with 
double stranded DNA genomes. However, it is impossible 
to generalize about cyanophages as they exhibit enor¬ 
mous diversity in terms of morphology, genome size, mol% 
GC composition, genetic relatedness and the effects of 
infection on host cell physiology. Several important
questions remain to be answered about cyanophages
and their impact on their cyanobacterial hosts.
Genomic studies on marine cyanophages are beginning 
to reveal fascinating insights into both infection strate¬ 
gies and cyanophage evolution. The publication of the 
genomes of some freshwater cyanophages will extend our 
knowledge in these areas, and, importantly, will clarify 
the evolutionary relationships between freshwater and 
marine cyanophages. 

An accurate assessment of their ecological signifi¬ 
cance will depend on a number of factors. There is still 
no methodology to estimate the true abundance of infec¬ 
tious cyanophage in a particular environment. Our under¬ 
standing of the selection pressure of cyanophages on 
genetic diversity and succession in natural cyanobacte¬ 
rial assemblages, together with their contribution to bio¬ 
geochemical cycles, is very limited. Topics such as the 
significance of lysogeny, pseudolysogeny and horizontal 
gene transfer have scarcely begun to be investigated. In 
short, the study of cyanophages has a long way to go. 
Marine Phages

ROBERT V. MILLER 


Two decades ago marine bacteriophages were unimportant
to microbial ecologists. After all, bacteriophages
could only be important in certain environments where 
their concentrations were high enough to have an effect on 
bacterial populations. In most microbiologists' minds, this 
limited them to just a few ecosystems such as waste treat¬ 
ment facilities, cheese and yogurt production facilities, and 
the microbiology laboratory. It certainly did not include 
marine environments! Of course, specific phages such as 
PM2 were of interest (18, 87; see also chapter 14) because of 
their cytology and molecular microbiology. The few studies 
that had addressed the frequency of viruses in the oceans, 
however, showed that marine bacteriophages were not 
prevalent enough to affect marine ecosystems. 

Two important developments in the 1980s changed this 
picture and stimulated interest in aquatic bacteriophages. 
The first arose from concerns originating from the newly 
emerging environmental biotechnology industry’s use of 
genetically engineered microorganisms. Many feared that 
the recombinant sequences would escape to naturally 
occurring bacteria by horizontal gene exchange (36). Soon 
it was demonstrated that transduction was a viable mecha¬ 
nism for genetic exchange in aquatic environments (27, 37, 
47, 68, 70, 72, 73). Still, phages were considered to be too infrequent
in the aquatic environment to make this a real possibility.
The second development, however, eliminated this
objection and clearly demonstrated the importance of 
bacteriophages in marine environments. In 1989, Bratbak, 
Heldal and others (5, 8, 11) demonstrated that in many 
aquatic environments bacteriophages were present in very 
high concentrations and that they often exceeded by one to 
two orders of magnitude the concentrations of bacterio- 
plankton that were their hosts. This report was quickly 
followed by several confirmatory publications (42, 64) that 
made it clear bacteriophages are indeed real players in the 
ecology of marine habitats. 

This chapter is designed to provide the reader with an 
up-to-date and informed overview of the current under¬ 
standing of the abundance of bacteriophages in aquatic, 
especially marine, environments. It is meant to demonstrate 


that bacterial viruses are important members of these eco¬ 
systems. It will explore the current controversy over the 
role of bacteriophages in the microbial loops of marine food 
webs (12). More detailed accounts of the subject can be 
found in several recent reviews (10, 20, 63, 79, 84, 100). See 
also chapter 32, which reviews cyanophages, the viruses 
of cyanobacteria. 

The Abundance of Bacteriophages 

in Marine Environments 

Why the Numbers Game? 

Because bacteriophages are intracellular parasites of bac¬ 
teria, they must find and infect a host bacterium in order to 
propagate. As the infective particle is biologically inert, 
virions encounter potential host organisms by simply diffus¬ 
ing through the suspending medium until they collide 
with a bacterium. If this collision results in the phage 
virion becoming attached to a host cell receptor, then infec¬ 
tion follows (41, 76). 

Since these collisions appear to occur at random, adsorp¬ 
tion follows first-order kinetics (30, 76). As a consequence, 
in laboratory studies at least, the kinetics of phage attach¬ 
ment is generally found to be dependent on the concentra¬ 
tion of both bacterial hosts and bacteriophages (32). Since 
most bacterial cells have the capacity to adsorb many bacte¬ 
riophage virions before their receptors are saturated, the 
distribution of infected and uninfected bacteria in a popula¬ 
tion follows a Poisson distribution where the fraction of 
uninfected hosts (B u ) is given by the formula B u = Be~ IPBRI , 
where B is the total concentration of bacteria per cubic centi¬ 
meter and PBR is the phage-to-bacterium ratio found in 
the environment (32, 41). When the PBR is low (<0.1), B u is 
very large, and the number of infected cells is an insignifi¬ 
cant fraction of the total population (41). 

Until the observations of Bratbak, Heldal and others 
(5, 8, 11) in 1989 and 1990, it was assumed that the number
of marine bacteriophages was very low and, therefore, that 
the PBR in these environments would be <0.1 (41). Earlier
studies had depended on enumeration of virions by observa¬ 
tion of the consequences of virulent infection of a specific 
strain of host organism (44), and that led to gross under¬ 
estimations of virion densities. For instance, in 1960 Spencer 
(75) detected four phage strains in North Sea water. 
They ranged in concentration from 0.1 to 10 plaque-forming 
units (PFU)/ml. A 1987 report by Moebus (44) found 
1-3 PFU/ml for phages infecting each of five unidentified 
marine bacterial isolates. The only report of substantial 
numbers of phages was in the brackish Kiel Bight of the 
Baltic Sea where Ahrens (2) found phages of Agrobacterium 
stellatum in numbers as high as 10^4 PFU/ml. In these same
environments, the numbers of bacteria were found to
be significantly higher, with cell counts often as high as
10^4-10^6 CFU/ml (41). Hence, PBRs in the marine environment
appeared to be very low, and it was reasonable to
assume that phages could not have a significant influence 
on the make-up or density of bacterial populations, mediate 
gene transfer, or affect the food chain of the ecosystem in 
any significant way (41). 

Current Estimates of Bacterial Virus and 

Virus-like Particles in Marine Environments 

The 1990s brought new techniques including transmis¬ 
sion electron microscopy (TEM; 9) to the isolation and enu¬ 
meration of marine bacteriophages (39, 78), and it soon 
became clear that the true numbers of phage-like particles 
often exceeded 10^8/ml (41). Combined with other modern
techniques including new methods of concentration (31,
58), epifluorescence (22, 51), flow cytometry (35), and pulsed-field
gel electrophoresis (17), TEM (33) allowed a truer picture
of the high concentration of bacteriophages in aquatic 
environments. (A complete discussion of these techniques 
including their strengths and shortcomings can be found 
in Wommack and Colwell (100).) 

The earliest report using TEM was actually in 1979, 
when Torrella and Morita (85) reported the numbers of 
bacteriophage particles in Yaquina Bay, Oregon to be as 
high as 10^3 particles/ml in areas where low concentrations
of dissolved organics were observed and as high as 10^4
particles/ml where concentrations of organic material were
high. The authors were careful to point out that no enrichments
for bacteriophages were made and that they collected
particles by employing a 0.2 µm pore-size filter. Therefore,
they suspected that their numbers represented minimum 
estimates of phage concentrations at best. While Torrella 
and Morita demonstrated higher concentrations of marine 
bacteriophages than had been observed in other studies, 
their work may not have made the impact of later studies 
because no comparison with the number of bacterial 
hosts was made (i.e., no PBRs were estimated). 

This was rectified in 1989 when Bergh and coworkers
(5) demonstrated not only high concentrations of


bacteriophages in several marine environments but that 
the PBRs in these environments were also high (>1.0).
Phage concentrations of 2 x 10^8/ml were reported in environments
that supported bacterial concentrations of only
6 x 10^6/ml (PBR = 33). Assuming that this environment
contained a minimum of 100 different phage-host systems, 
the authors estimated that the rate of phage adsorption 
might be as high as 2.5 particles/min per milliliter (5). 
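
The PBR quoted above follows directly from the two concentrations. As a minimal sketch (the function name is illustrative):

```python
def phage_to_bacterium_ratio(phage_per_ml, bacteria_per_ml):
    """Return the PBR: phage particles per bacterial cell in the same volume."""
    if bacteria_per_ml <= 0:
        raise ValueError("bacterial concentration must be positive")
    return phage_per_ml / bacteria_per_ml


# Concentrations quoted above for the study of Bergh and coworkers (5).
pbr = phage_to_bacterium_ratio(2e8, 6e6)
print(f"PBR = {pbr:.0f}")
```

The same calculation applies to any pair of direct counts, provided that both are expressed per unit of the same volume.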

Bergh et al.’s (5) report was quickly followed by 
others (5, 8, 11, 42, 64, 82) which confirmed their findings 
and demonstrated that bacteriophage concentrations of 
this magnitude were common. Many of these studies 
related the numbers of bacteriophages to the numbers of 
potential hosts in these environments, allowing the cal¬ 
culation of PBRs for those environments (table 33.1). 
Phages could no longer be neglected in describing aquatic
environments.

Suttle et al. (82) used TEM to show that natural marine 
water contains between 10^6 and 10^9 virus particles per milliliter.
They demonstrated that these particles included
viruses that infect a variety of microorganisms, in addition 
to bacteria, including diatoms, cryptophytes, prasinophytes, 
and cyanobacteria and that viral particle concentrations 
were higher in estuarine waters than in the open ocean. 
Thus, marine viruses became known as important effectors 
of populations of all types of microorganisms in marine 
environments. 

Paul et al. (59, 60) studied bacteriophage concentrations
at Key Largo, Florida (59), and at Mamala Bay,
Oahu, Hawaii (60). In both environments they found that
bacteriophage counts were highest in eutrophic areas
(approximately 10^7/ml) and declined to around 10^6 particles/ml
in offshore waters. Interestingly, the concentration of
phages was inversely proportional to water salinity (59). 

Wichels et al. (95) investigated the diversity of phages 
in a collection of 85 phages isolated from the North Sea. 
They found that the majority were members of the family 
Myoviridae, but members of the families Siphoviridae and 
Podoviridae were also in the collection. They determined 
that the phages belonged to 13 different species and that 
they all infected Gram-negative, facultatively anaerobic, 
motile, coccoid hosts that appeared to belong to the γ subdivision
of the Proteobacteria. Thus, the potential of viruses
to control bacterial diversity as well as numbers in natural 
marine environments becomes apparent. 

Spatial and Temporal Variation in 

Bacteriophage Concentrations and PBRs 

The size of bacteriophage populations has often been 
observed to vary seasonally. Bergh et al.’s (5) 1989 paper was 
the first to observe seasonal variation in the number of 
bacteriophages. Several orders of magnitude separated 
high summer frequencies from low winter numbers. In 
a subsequent report, Bratbak et al. (11) found that the 
Table 33.1 Phage-to-Bacterium Ratios (PBR) and Concentrations of Virus-Like Particles in
Various Marine Habitats

Marine environment                              Virus-like particles (x10^6/ml)   PBR       Reference

Estuarine
  Chesapeake Bay, USA                           3-140                             3-26      101
  Chesapeake Bay, USA                           10                                3         5
  Lake Saelenvannet, Norway                     20-300                            20-80     89
  Tampa Bay, USA                                5-16                              0.4-9     15
  Tampa Bay, USA                                5-20                              0.9-9     25

Coastal ocean
  Arctic Ocean, Resolute, Canada (sea ice)      9-430                             10-72     34
  Arctic Ocean, Resolute, Canada (seawater)     1.1                               10-20     34
  Bering and Chukchi Seas                       2.5-36                            5-5       77
  North Adriatic Sea                            9-130                             5-25      93
  Pacific Ocean, Japan                          1-40                              2-18      22
  Paradise Harbor, Antarctica                   0.2-1.3                           0.7-6     7
  Raunefjorden, Norway                          0.01-10                           <1-36     5
  Southern California Bight, Santa Monica, USA  18                                14        51

Open ocean
  Barents Sea                                   0.06                              3         5
  North Atlantic                                14                                50        5
  North Pacific (subarctic)                     0.06-0.4                          1-4.5     23
  North Pacific (subtropical)                   0.4-2                             1-9       23

concentration of phages rose from 5 x 10^5/ml in early spring
to a maximum of more than 10^7/ml during a late summer-fall
diatom bloom in a Norwegian fjord. Wommack et al.
(101) found that the numbers of bacteriophages in Chesapeake
Bay showed a peak from August to October with
phage counts ranging from 10^6 to 10^8 particles/ml. Virus
counts were always at least 3 times greater than bacterial 
counts and PBRs as high as 25 were reported at some 
samplings. 

Weinbauer et al. (93) studied diel, seasonal, and depth- 
related variation in viral concentrations in the Northern 
Adriatic Sea. They found that during periods of water strati¬ 
fication, the highest numbers of phages and the highest 
PBRs were found at the thermocline. They speculated
that this was due to the higher microbial biomass found 
there. In their studies, phage concentrations showed a 
strong seasonal variation, with phage abundance greatest 
in the fall (10 8 -10 9 particles/ml) and lower in the winter 
(<10 7 /ml). PBRs averaged 15 throughout the year, clearly 
demonstrating the potential of viruses to be key players in 
the ecology of the Adriatic. 
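The PBR values quoted in this section and in Table 33-1 are simple ratios of two direct counts. As a minimal sketch (the function names are illustrative and do not come from the studies cited), the ratio and the bacterial concentration it implies can be computed as follows:

```python
def phage_to_bacterium_ratio(virus_per_ml: float, bacteria_per_ml: float) -> float:
    """Return the PBR: virus-like particles per bacterial cell."""
    if bacteria_per_ml <= 0:
        raise ValueError("bacterial concentration must be positive")
    return virus_per_ml / bacteria_per_ml


def implied_bacteria_per_ml(virus_per_ml: float, pbr: float) -> float:
    """Back-calculate the bacterial concentration implied by a virus count and a PBR."""
    if pbr <= 0:
        raise ValueError("PBR must be positive")
    return virus_per_ml / pbr


if __name__ == "__main__":
    # Southern California Bight entry of Table 33-1: 18 x 10^6 particles/ml, PBR 14.
    bacteria = implied_bacteria_per_ml(18e6, 14)
    print(f"implied bacterial count: {bacteria:.2e} per ml")
```

For that table entry the implied bacterial concentration is about 1.3 x 10^6 cells/ml.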

Seasonal changes in the frequency of lysogenized 
bacteria were observed by Cochran and Paul (15). They 
found that samples that displayed prophage induction 
were plentiful during the warmer months, but no induction
was observed in November, December, and January.
Frequencies of inducible lysogens in the bacterial population
ranged from undetectable in the winter months to as
high as 37% in October, although the average varied 
around 10%. 


Lytic Infection, Lysogeny, and 
Pseudolysogeny: Alternate Life-styles for 
the Wet and Small 

Several studies have demonstrated the short infective life 
of bacteriophage virions in aquatic environments (73,100). 
Decay rates of 5-30% per hour are not uncommon. This 
has led several investigators to explore the ways in which 
viruses maintain themselves in these environments. 
Aquatic environments are characterized by slow-growing
bacterial hosts maintained at relatively low concentrations
of <10^5/ml (10, 11, 84). These are the very conditions that
favor a temperate life-style (42, 100) and there have been
many temperate bacteriophages (including temperate
cyanophages; 54) isolated from marine environments (4, 29,
43, 48, 54, 56, 66, 99). In fact, Ackermann and DuBow (1),
following a thorough investigation of the literature, estimated
that, while the frequency of lysogeny varies among
different bacterial taxonomic groups, between 21% and
60% of environmental bacteria are lysogens.
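The decay rates quoted at the start of this section translate into short infective half-lives. The sketch below assumes, purely for illustration, that a constant fraction of the remaining infective virions is lost each hour (first-order decay); the function name is hypothetical.

```python
import math


def infective_half_life_hours(decay_fraction_per_hour: float) -> float:
    """Half-life in hours when a constant fraction of infectivity is lost each hour."""
    if not 0 < decay_fraction_per_hour < 1:
        raise ValueError("decay fraction must lie strictly between 0 and 1")
    # Remaining fraction after t hours is (1 - r)**t; solve (1 - r)**t = 0.5 for t.
    return math.log(0.5) / math.log(1.0 - decay_fraction_per_hour)


if __name__ == "__main__":
    for rate in (0.05, 0.30):
        print(f"{rate:.0%} per hour -> half-life {infective_half_life_hours(rate):.1f} h")
```

Under this assumption, losses of 5% and 30% per hour correspond to half-lives of roughly 13.5 h and 1.9 h, respectively.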

Since free virions of temperate and lytic bacteriophages 
cannot be distinguished via microscopy, investigators 
have turned to a variety of chemical and physical treatments 
including in situ hybridization (52, 53) to identify environmental
lysogens. Jiang and Paul (26) used mitomycin C
induction to estimate that 38% of the bacterial population
in estuarine environments is lysogenized. Detectable lysogens
were less frequent offshore.

In a later study, Jiang and Paul (28) explored the frequency
of lysogens among 116 bacterial isolates from
various marine environments. More than 40% of the
strains contained inducible prophage. A higher percentage
of lysogenized bacteria was found among isolates from
oligotrophic environments than from coastal or estuarine
waters. These observations are consistent with the assumption
that lysogeny will be a more important life-style among
bacteriophages in environments where hosts are few and
energy is limited.

Weinbauer and Suttle (91, 92) found the frequency of 
inducible lysogens to be low (<5%) in bacterial populations
in the Gulf of Mexico. The frequency was higher in
offshore (2-11%) populations than it was in coastal water
(1-2%). While the total numbers of inducible lysogens were
lower than in some other studies, these data are also consistent
with the assumption that lysogeny is more important
in oligotrophic environments. 

In addition to lytic and temperate growth, there is mounting
evidence for an intermediate state, pseudolysogeny, of
bacterium-bacteriophage interaction in natural environments.
Simply stated, pseudolysogeny is a phage carrier
state resembling lysogeny in that after infection the bacteriophage
can either enter a cryptic, intracellular state or
sustain rapid lytic infection. However, unlike true lysogeny,
pseudolysogeny does not involve integration of host and
phage genomic DNA and the phage DNA is neither replicated
nor segregated equally into all progeny cells (1, 69).

Many reports of pseudolysogeny (45, 46, 52, 67, 97) or of
phage-host systems with characteristics of pseudolysogeny
(99) in aquatic environments have appeared in the
literature (for a review see 40). The bacteriophage Hs1 of
Halobacterium salinarium serves to illustrate many of the
characteristics of pseudolysogeny and was one of the first
marine archaeal systems to be studied (67, 86). In culture,
infections of Hs1 were characterized by sporadic lysis of
major portions of infected batch cultures, yet it proved
impossible to subculture stable lysogenic clones from the
survivors. The interaction between host and phage changed
with the concentration of salt in the medium. At 17.5%
(wt/vol) NaCl (the lower limit for host survival) phage infection
appeared to be virulent. However, as the concentration
of salt approached 30%, the majority of infections led to
nonproductive phage carrier states (pseudolysogens). Such
conditions favor phage survival in environments where
host growth is not favored (86). These observations demonstrate
an essential characteristic of pseudolysogeny: phage
production is regulated by environmental conditions that
dictate host growth and survival (69, 100).

Following careful scrutiny of numerous marine phage-host
relationships, Moebus (44, 45) has concluded that
pseudolysogeny is a common phenomenon in marine ecosystems.
Ackermann and DuBow (1) extended this generalization
to include bacteriophage-host systems in general.
Wommack and Colwell (100) suggest that pseudolysogeny
affords “phage populations a means of quickly reacting to
environmental changes.” Ripp and Miller (40, 69) suggest
that it affords environmental phages a mechanism with
which to extend their infective half-life in environments
with limited nutrients and capacity for viral growth until
nutrients are available. Thus, Wommack and Colwell (100)
believe that influxes of limited amounts of nutrients into
a nutrient-limited marine ecosystem can stimulate both
bacterial production and bacterial mortality through lytic
activation of preprophage (the carrier state of the viral
genome within the pseudolysogenized host envisioned
by Ripp and Miller; 69).

Factors Affecting Bacteriophage 
Concentration 

We have already seen that many environmental factors 
such as time of year (5, 15, 93, 100, 101), concentration 
of dissolved organics (59, 60, 82, 100), and salinity (59, 62, 
100, 104) influence bacteriophage abundance and PBRs 
in marine environments. Other studies have explored 
the effects of other environmental components on phage 
populations. 

Babich and Stotzky (3) investigated the differential toxicities
of mercury to bacteria and bacteriophages in aquatic
environments. They found that chloride salts of mercury
were less toxic than was metallic mercury. This correlated
with the fact that mercury was less toxic to bacteria
and bacteriophages in chloride-rich marine
waters than it was to bacteria and phages in fresh water.

A primary effector of bacteriophage virion stability in 
marine environments is solar ultraviolet (UV) light. Noble 
and Fuhrman (50) found that decay rates of several viruses 
in coastal waters were almost twice as fast in full sunlight 
as in the dark. Wommack et al. (102) tested decay rates of 
two Aeromonas phages. They incubated microcosms containing
their phage-host test systems in Chesapeake Bay
and in the York River estuary. Three types of microcosms 
were used. The first group was surface-incubated and 
unshielded (full sunlight). The second group was surface- 
incubated but covered (dark). The third group was 
unshielded but incubated at a depth of 1 m. Decay rates of 
phage infectivity were double in the first group compared 
with either of the other groups. These latter two groups 
showed decay rates equal to values obtained in the laboratory.
However, the rates of destruction of phage particles
were similar under all conditions and slower than any of 
the decay rates. Destruction of particles appears to be a 
process separate from loss of infectivity. This observation 
sounds a cautionary note for those studies that have used 
only TEM observation to determine the number of phages 
active in marine environments. 

Suttle’s group, currently at the University of British
Columbia, has carried out an in-depth study to determine
the effects of UV damage and repair on marine bacteriophages
(80, 81, 94). They found that not only were
UVB (290-320 nm) wavelengths damaging, but UVA
(320-400 nm) and photosynthetic light (400-700 nm)
could also reduce phage infectivity (80). They studied
host-associated repair of Vibrio natriegens bacteriophage
in experiments conducted in offshore, coastal, and estuarine
waters (94). In these experiments, light-dependent repair (probably
photoreactivation using 370-550 nm light) compensated
for a large fraction of the sunlight-induced damage to
phage DNA (94). Photoreactivation appears to be essential
to maintaining high concentrations of viable viruses in
surface marine waters.


Control of Bacterial Population 
Densities and Implications for the 
Marine Food Web 

Phage-Induced Mortality 

In one of the earliest studies using TEM, Proctor and 
Fuhrman (64) demonstrated that a significant fraction of 
bacterial mortality in the ocean was due to viral infection. 
They observed that up to 7% of the heterotrophic bacteria 
and 5% of the cyanobacteria from diverse marine locations
contained mature phages. These data suggested that
up to 70% of the prokaryotes in these environments were 
infected with phages at any given time and that up to 30% 
of cyanobacterial and 60% of heterotrophic bacterial 
mortality was due to bacteriophage infection. 
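The step from 7% of cells containing visible mature phages to as many as 70% of cells being infected is not spelled out above. One reading consistent with those two figures is that mature particles can be seen only during roughly the final tenth of the infection cycle; that conversion factor is an assumption made here for illustration, and the function below is a hypothetical sketch of the arithmetic rather than the authors' procedure.

```python
def infected_fraction(visibly_infected: float, visible_stage_fraction: float = 0.1) -> float:
    """Estimate the total infected fraction from cells with visible mature phages.

    visible_stage_fraction is the assumed share of the infection cycle during
    which mature particles are visible by TEM (0.1 is an illustrative value).
    """
    if not 0 <= visibly_infected <= 1:
        raise ValueError("visibly infected fraction must lie between 0 and 1")
    if not 0 < visible_stage_fraction <= 1:
        raise ValueError("visible stage fraction must lie in (0, 1]")
    # The estimate cannot exceed the whole population.
    return min(1.0, visibly_infected / visible_stage_fraction)


if __name__ == "__main__":
    print(f"{infected_fraction(0.07):.0%} of heterotrophic bacteria infected (upper estimate)")
```

The result scales inversely with the assumed visible stage, so a longer visible period would lower the estimate proportionally.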

In a later, more detailed study, Proctor and Fuhrman (65) 
modified their expectations somewhat, but still estimated 
that mortality due to bacteriophage-induced lysis ranged 
from a low of 3% to a high of 62% for free-living bacteria
and 52% for particle-associated cells. These studies
warned that bacteriophage infection exerts a significant 
influence on carbon and nitrogen cycling in marine food 
webs. 

Weinbauer and Suttle (91, 92) found that even in populations
that contained low frequencies of inducible prophages
(2-11% of the population), as much as 5% of the
total bacterial mortality could be accounted for by induction
of lysogens to lytic growth. Cochran et al. (16) demonstrated
that many environmentally important pollutants, in
addition to UV light, can act as inducing agents for natural 
lysogens in the Gulf of Mexico. 

Miller (38) prepared microcosms of sterilized water from
the Gulf of Mexico and inoculated them with a lysogen of
Vibrio parahaemolyticus. These microcosms were incubated
for a 3-day period on the deck of a research vessel during
a research cruise of the Gulf. One half of the microcosms
were exposed to natural solar radiation (containing UV
light); the other half were covered to protect them from
solar radiation. Results indicated that mortality of the lysogenic
bacteria was dramatically increased in microcosms
exposed to sunlight. Likewise, the number of virions isolated
from the microcosms was much higher in sunlight-exposed
chambers. In this time of increased solar UV exposure,
due to ozone thinning, increased bacterial mortality due to
induction of naturally occurring lysogens must be considered
in estimating the effects of stratospheric ozone
depletion on marine food webs.

Tuomi et al. (88) found that increased availability of 
carbon and energy in a seawater microbial community 
stimulated virus production to a greater extent than it stimulated
bacterial biomass production. They found that significant
increases in the PBR were brought about by these
conditions. Although not investigated by these authors,
these data suggest that the addition of nutrients to the
environment leads to the activation of lysogens or pseudolysogens
to the production of phage virions. These data are
consistent with the finding of Ripp and Miller (69) that 
addition of an energy source to a freshwater microcosm led 
pseudolysogens of Pseudomonas aeruginosa starved for 15 
days to lyse, releasing phage particles. It is well established 
that nutrient status influences the decision between lytic 
and temperate growth upon primary infection (98). The 
data of Tuomi et al. (88) suggest that a close link between 
nutrient concentration and initiation of phage production 
also exists in aquatic environments. 


Control of Microbial Population Size

The realization that bacteriophage concentrations were 
high enough to affect bacterial population size and rates 
of mortality turned the accepted dogma concerning food 
webs in the ocean upside down. Significant bacterial lysis 
by phages would alter the amount of dissolved carbon and 
the movement of carbon and energy through the food 
chain from bacteria to grazers and upwards to higher organisms.
In addition to bacteriophages, the concentrations
of viruses of other microorganisms, including primary 
producers (algae and cyanobacteria), were high enough to 
alter the accepted view of the contributions of algae and 
cyanobacteria to the movement of carbon through the 
marine food web. 

Wommack and Colwell (100) developed a conceptual 
model of the microbial loop of the marine food web that 
includes viruses and viral lysis (figure 33-1). Similar ideas 
were voiced earlier by Fuhrman (19). These models demonstrate
that bacteriophage activity enhances the flux of
bacterial biomass into the dissolved organic material (DOM)
pool. In addition, viral lysis of photosynthetic algae and
cyanobacteria is found to augment the flux of photosynthetically
fixed carbon into the DOM pool. Thus, the effect
of viral lysis is to divert carbon away from mesozooplankton
consumers (grazers) and into the DOM pool (19, 96,
100). Viral lysis is extremely efficient in moving biomass into 
the DOM pool because essentially all of the cell’s contents 
move directly from the bacterial cell into the DOM pool 
(19, 100).

Figure 33-1 Microbial viruses and the microbial loop. A schematic diagram highlighting the potential role of virus infection
and virus-induced cell lysis in the production of DOM in aquatic ecosystems. Reproduced with permission from Wommack
and Colwell (100).


Bratbak et al. (11) hypothesized that phage lysis of bacterioplankton
was a source of nutrient-rich growth substrate
for bacterial populations. By allowing rapid recycling of
carbon between bacterial biomass and DOM, phage infection
allows bacterial populations to be sustained at higher
levels than would be possible if no bacteriophage activity
were present. This predicts that the results of active
bacteriophage lysis are (i) increases in bacterial biomass 
and (ii) less transfer of organic matter to higher trophic 
levels. 

Fuhrman (19) compared two hypothetical marine food 
webs. The first allowed for viral lysis of host cells; the 
second did not. If viral lysis was kept at a level of 50% of 
total bacterial mortality in the first system, there was a 27% 
increase in bacterial production rates and a 37% decrease 
in export of DOM to nanozooplankton grazers compared 
with the web where no bacteriophage lysis was allowed. 
This resulted in a net loss of 25% in nanozooplankton 
production. A later expansion of the model by Fuhrman 
and Suttle (21) included both viral infection of phytoplankton
and loss of viroplankton to consumption by nanozooplankton.
Even in this model, similar increases in bacterial
production and reductions in nanozooplankton production 
were predicted. 


Murray and Eldridge (49) carried out intensive theoretical
exploration of the impact of viruses on aquatic microbial
food webs. They allowed growth efficiency, recycling
efficiency, and virus-induced mortality to vary under both 
oligotrophic and mesotrophic nutrient regimes and found 
that bacteriophage infection had the greatest impact in 
oligotrophic environments where recycling of organic 
matter predominates. Smaller effects were predicted in 
mesotrophic environments. 

Bacteriophage Effects on Microbial
Diversity and Evolution

Following a careful analysis of all available literature, 
Wommack and Colwell concluded that approximately “20% 
or less of bacterioplankton and phytoplankton mortality is 
attributable to viral infection. Thus, viruses play a modest 
[but appreciable] role in controlling the population densities 
of bacteria” (100). Phages may have an even greater influence
on species diversity for two reasons. First, bacteriophages
can increase diversity by, as Thingstad (83) has
speculated, “killing the winner” and allowing species
to survive that would otherwise be eliminated by the
dominant species. Second, they can increase diversity at
the genetic level by mediating genetic exchange among
bacterial strains and even species via transduction (68, 100).

Species Diversity 

Recently, Wommack and Colwell (100) modeled bacteriophage
influence on host community diversity. Their model
predicted blooms of host bacteria due to changes in nutrients
or environmental factors followed by killing of hosts
and short-lived blooms of bacteriophages. Such blooms
have been observed in situ (5, 55, 89, 105).
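The bloom-and-collapse behavior described above can be illustrated with a generic Lotka-Volterra style host-phage model. This is not the model of Wommack and Colwell; the equations, the function names, and every parameter value below are assumptions chosen only to produce a host bloom followed by a phage bloom.

```python
def euler_step(host, phage, dt, growth=1.0, adsorption=1e-8, burst=50.0, decay=0.5):
    """Advance host and phage densities (per ml) by one explicit Euler step of dt hours."""
    infections = adsorption * host * phage             # new infections per ml per hour
    new_host = host + (growth * host - infections) * dt
    new_phage = phage + (burst * infections - decay * phage) * dt
    # Clamp at zero so that a coarse step cannot produce negative densities.
    return max(new_host, 0.0), max(new_phage, 0.0)


def simulate(host0, phage0, hours, dt=0.001):
    """Return a list of (time, host, phage) tuples for the hypothetical system."""
    steps = int(round(hours / dt))
    host, phage = float(host0), float(phage0)
    trajectory = [(0.0, host, phage)]
    for n in range(1, steps + 1):
        host, phage = euler_step(host, phage, dt)
        trajectory.append((n * dt, host, phage))
    return trajectory


if __name__ == "__main__":
    run = simulate(host0=1e5, phage0=1e6, hours=20)
    t_host = max(run, key=lambda row: row[1])[0]
    t_phage = max(run, key=lambda row: row[2])[0]
    print(f"host peak at {t_host:.1f} h, phage peak at {t_phage:.1f} h")
```

With these illustrative parameters the steady state lies at 10^6 hosts/ml and 10^8 phages/ml; starting below it lets the host bloom first, after which phage numbers rise and the host declines.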

Hennes et al. (24) monitored population densities of
the Vibrio natriegens strain PWH3a and its fluorescently
labeled phage PWH3a-P1 following their introduction
into a seawater microcosm. PWH3a rapidly increased to
40% of the population and then declined to less than 2%.
The titers of PWH3a-P1 phage, on the other hand, rose
throughout the incubation period from undetectable levels
to 10^8/ml (about 70% of the total viral population). These
authors concluded that viruses present at low abundances
in natural aquatic viral communities can control microbial
community structure.

In a study conducted in Chesapeake Bay, Wommack 
et al. (103) examined the presence of specific viruses using 
gene probes. They determined that titers of single viruses 
were highly localized within the Bay and changed over 
time with peaks and declines. The dynamics observed 
were consistent with models predicting bacteriophage 
control of blooms of single host strains (24,100) that reduce 
their abundance and allow continued diversity in the 
population. 

Thingstad (83) suggests that viral killing of dominant 
species ensures the coexistence of competing bacterial 
species. Bacteriophages act as a balancing force that allows 
bacterial species with different growth rates to coexist in 
a steady state due to the establishment of a “hierarchical 
theory of top-down control of diversity” (83). The limiting 
nutrient element that sets total biomass in the food web is 
at the top of this hierarchy. The second level is size-selective 
predation by grazers etc. that allows for distribution of 
the limiting element into different functional groups of the 
web. Finally, bacteriophage-host specificity allows further 
nutrient distribution into different species within a functional
group (83).

Genetic Diversity 

Besides controlling diversity by regulating sizes of host
populations, bacteriophages can increase diversity in
bacterial gene pools through horizontal gene transfer by
transduction. For instance, the acquisition of virulence
determinants by disease-causing Vibrio cholerae is associated
with transduction and lysogenic conversion (6, 13,
90). Until the realization that there are high concentrations
of bacteriophages in the aquatic environment, transduction
was dismissed as a viable gene transfer system for several
reasons. First, most bacteriophages are restricted to infection
of a narrow range of bacterial hosts (41). Second, transduction
is by its nature a reductive process that kills the
genetic donor in the process of producing transducing
particles (36). However, on the positive side, transduction
may be favored over conjugation and transformation
because the gene-transfer elements (transducing particles)
are relatively long lived and transduction does not require
cell-to-cell contact (62, 74). When the true concentrations
of bacteriophage particles in the aquatic environment
were realized, transduction suddenly became a viable
option for gene transfer in these environments (37).

Transduction was demonstrated in the marine environment
by Jiang and Paul (27) who established a model transduction
system using marine phage-host isolates that
complemented long-studied freshwater models (37, 38).
They isolated a bacterium, DIB, from Mamala Bay, Hawaii
that was identified as a Flavobacterium sp. (27), and a
temperate phage, T-φDIB, that infected DIB. Transduction
experiments using this phage-host system were carried
out in microcosms in the Tampa Bay Estuary, and transduction
rates as high as 10^-7 transductants/PFU were
observed for the transfer of the Tra- plasmid pOSR50
(Kan^r, Str^r). Rates of 4 x 10^-[exponent illegible] were documented in microcosms
containing mixed bacterial communities from the
Bay. Even though these rates are low, they would produce as
many as 10^14 transduction events each year in a marine
environment the size of Tampa Bay (27).

Chiura (14) used virus-like particles released from isolates
of various marine bacteria to transduce auxotrophic
strains of Escherichia coli and Bacillus subtilis to prototrophy.
They obtained transduction frequencies as high as
10^-[exponent illegible]/particle. These studies demonstrated that naturally
occurring virus-like particles could act as transducing
agents.

The studies described above and a number of others 
have demonstrated the real potential for transduction in 
aquatic environments and several reviews on the subject 
have appeared (34, 36, 41, 57, 71). Clearly, transduction can 
no longer be dismissed as a phenomenon restricted to 
the microbiology laboratory and must be considered in 
assessing the evolutionary potential of bacteria in aquatic 
environments. 

Conclusions 

It is now well established that bacteriophages are an active 
and dynamic factor in marine ecosystems. Bacteriophages 
must be considered in any investigation of the importance 
of microorganisms to the ecology of all habitats found on 
our planet. Bacteriophages must be included in any 
models of natural food webs, and they must be considered
in assessing the impact of pollutants and global environmental
changes on the biosphere of the Earth.

The genus Yersinia contains 11 species, of which Yersinia
enterocolitica, Yersinia pestis, and Yersinia pseudotuberculosis
are pathogenic for humans and animals (5). While
Y. pestis is the causative agent of plague, Y. enterocolitica and
Y. pseudotuberculosis are mainly enteropathogenic. In comparison
with the other pathogenic Yersinia species, Y. enterocolitica
is very heterogeneous, comprising pathogenic
and nonpathogenic strains which belong to about 70 serogroups.
In order to differentiate Yersinia strains, a number of
phages were isolated and used for typing. Several phage
typing sets have been worked out for Y. enterocolitica (4, 30,
38). The sets contain temperate phages isolated from
Yersinia strains (37, 39) or phages isolated from sewage (2, 8),
whose origin is unknown. Three yersiniophages have
been sequenced: phages φA1122, φYeO3-12, and PY54. This
chapter is primarily an overview of the properties of those
three phages.


Overview of Yersiniophage Characteristics

Host Range 

Temperate Y. enterocolitica phages can be easily isolated
since high frequencies (up to 86.4%) of lysogeny are
observed (36, 58). In contrast, much lower frequencies have
been determined for nonpathogenic Yersinia species (9) and
there are only a few reports about the isolation of temperate
phages from Y. pestis and Y. pseudotuberculosis (33, 40). Most
of the temperate Y. enterocolitica phages showed a narrow
host range and were not able to infect other Yersinia species,
other Enterobacteriaceae, or Gram-positive bacteria
(23, 39). However, phages isolated from nonpathogenic
Yersinia species are often active on pathogenic Y. enterocolitica
strains (10, 12). Other than host ranges, derived from
phage-typing experience, there is only scarce information
available on Yersinia phages.


Morphologies 

Kasatiya and Ackermann (23) studied the morphology of 
nine Y. enterocolitica phages and found four morphotypes. 
The phages had isometric or elongated heads and contractile
or very short tails. Phages with isometric heads and
contractile tails were also described by Kawaoka et al. (27).
In a more comprehensive study, eight temperate yersiniophages
were characterized on the basis of their morphology,
host range, genome size, DNA homology, and protein
composition (45). The phages contain double-stranded
DNA with genome sizes of 40-60 kb. The morphology of
the phages suggested that they belong to different families.
However, DNA homologies were observed between
members of the families Myoviridae and Podoviridae. 

Temperature Dependence 

Several authors noted that temperate yersiniophages are
not lytic at 37 °C (4, 39). Studies of growth-temperature-dependent
variation of lipopolysaccharides
indicated that the receptor for a number of yersiniophages
is a glycoconjugate other than lipopolysaccharide, one
that is synthesized at 25 °C but not at 37 °C (28, 29). Contrary
to this report, Calvo et al. (11) showed that the receptor for
a Y. enterocolitica O:3-specific phage was present at 37 °C
and concluded that the phage was able to adsorb but not to
replicate at this temperature.

Transduction 

One of the phages in the study by Popp et al. (45), phage PY20
isolated from a pathogenic Y. enterocolitica O:3 strain, was
used for transduction experiments. PY20 was able to transduce
small Yersinia plasmids (4.3 and 5.8 kb) but not the
70 kb Yersinia virulence plasmid, pYV. The same result was
obtained with phage mixtures isolated from sewage that
had been tested for their ability to infect Yersinia strains
(18). Evidently these phages did not have the packaging
capacity to encapsidate the relatively large pYV plasmid.
In contrast, phage P1 was shown to transfer the pYV Yersinia
virulence plasmid of Y. pestis to Y. pseudotuberculosis (59).

Sequenced Yersiniophages 

Up to now the genomes of three yersiniophages (φA1122,
φYeO3-12, and PY54) have been sequenced. The phages
φYeO3-12 and PY54 are discussed in the following
sections.


Yersiniophage φA1122

Overview

Phage φA1122 is a virulent phage of Y. pestis which at 37 °C
is also active on Y. pseudotuberculosis (15). The φA1122
genome consists of 37,555 bp including direct terminal
repeats of 148 bp. Fifty-one gene products have been
predicted.

T3/T7-Like Phages 

φA1122 reveals a strong relationship to the sequenced
yersiniophage, φYeO3-12 (see below), and to coliphages
T7 and T3 (see Chapter 20). Aside from some genes that
have no positional counterpart in these other phages, the
φA1122 genome is 89% and 73% identical to the genomes
of T7 and T3, respectively. Moreover, the strong relationship
is corroborated by similar or identical promoters, transcriptional
terminators, terminal repeats, and origins of
replication. At the protein level, most φA1122 proteins
exhibit more than 80% identity with the homologous T7
proteins. Three and four φA1122 proteins are even identical
to their T7 and T3 counterparts, respectively. Interestingly,
almost one quarter of the φA1122 genome (genes 15 to 19),
coding for about half of the morphogenic functions, is
99.8% identical to T3. The G-C content of the corresponding
T3 region is significantly less than that of the remainder
of the T3 genome and much closer to that of the φA1122
genome. As the first 26 kb of the T3 genome, up to gene
15, is approximately 90% identical to the Y. enterocolitica
phage φYeO3-12, it has been suggested that phage T3
might have arisen by recombination of two yersiniophages
(15).
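The regional G-C argument above amounts to comparing the base composition of one stretch of a genome with that of the remainder. The sketch below shows that comparison on an arbitrary sequence string; the function names and the toy sequence are illustrative assumptions, and no real genome coordinates are implied.

```python
def gc_percent(seq: str) -> float:
    """Percentage of G and C among the unambiguous bases (A, C, G, T) of seq."""
    bases = [b for b in seq.upper() if b in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no A, C, G, or T")
    gc = sum(1 for b in bases if b in "GC")
    return 100.0 * gc / len(bases)


def regional_gc(seq: str, start: int, end: int):
    """Return (G-C% of seq[start:end], G-C% of the rest); coordinates are 0-based, half-open."""
    region = seq[start:end]
    remainder = seq[:start] + seq[end:]
    return gc_percent(region), gc_percent(remainder)


if __name__ == "__main__":
    toy = "GGCCGGCCATATATAT"          # made-up sequence, not phage DNA
    inside, outside = regional_gc(toy, 8, 16)
    print(f"region: {inside:.1f}% G-C, remainder: {outside:.1f}% G-C")
```

A region acquired from a different phage would be expected to stand out in such a comparison, which is the pattern described for the T3 segment.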

Host Range Divergence 

Although phage T3 does not infect Yersinia, the gp17 tail
fiber proteins of T3 and φA1122, determining the host
range of these phages, are nearly identical. It could be
demonstrated that besides Yersinia, phage φA1122 is able
to plate on Escherichia coli K12, albeit with a much lower
efficiency. The phages released from E. coli showed an
expanded host range, infecting both Y. pestis and E. coli
at the same efficiency. Sequence analysis revealed a single
mutation in gene 17 of phage φA1122. This mutation leads
to an amino acid exchange at position 523 near the C-terminus
of the protein, which interacts with the bacterial
cell surface. The same amino acid is present at the corresponding
position in the tail fiber protein of T3. These
data suggest that, although T3 is a specific
phage of E. coli while φA1122 mainly infects Y. pestis,
one or even two parents of T3 are very likely
yersiniophages.


Yersiniophage φYeO3-12

Isolation of the Phage 

The Y. enterocolitica serotype O:3 (YeO3)-specific bacteriophage,
φYeO3-12, was isolated in 1988 from the raw incoming
sewage of the Turku City sewage treatment plant (54).
A phage of identical characteristics was isolated from
the sewage in 2000, indicating that the phage is stably
colonizing the sewage system (M. Skurnik, unpublished
observations).

Phage Characteristics 

In electron micrographs, the φYeO3-12 particles have
approximate dimensions of 57 nm for the head and
15 x 8 nm for the tail. Some capsids show pentagonal
outlines, indicating their icosahedral nature. This morphology
classifies φYeO3-12 to the family Podoviridae (35) and
in Bradley’s classification to type C (6) (see chapter 2 for
a review of phage classification). When growing on Y. enterocolitica
strain YeO3, φYeO3-12 has eclipse and latent
periods of 15 and 25 min, respectively, followed by a short
rise period of 10 min. The resulting lysis of the host cell
produces a burst size of 100-140 phage progeny (41)
(for an overview of phage infection-timing and burst-size
characters see chapter 5).
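The timing parameters above define an idealized one-step growth curve: no free progeny during the 25-min latent period, then a rise over 10 min to the final burst size. The sketch below assumes a linear rise and uses 120 progeny per cell, the midpoint of the reported 100-140 range; both choices are simplifying assumptions made here, and the function name is hypothetical.

```python
def free_phage_per_infected_cell(minutes: float, latent: float = 25.0,
                                 rise: float = 10.0, burst: float = 120.0) -> float:
    """Idealized progeny released per infected cell at a given time after infection."""
    if minutes <= latent:
        return 0.0                      # latent period: no progeny released yet
    if minutes >= latent + rise:
        return burst                    # plateau: every infected cell has lysed
    # Linear interpolation across the rise period (a simplifying assumption).
    return burst * (minutes - latent) / rise


if __name__ == "__main__":
    for t in (10, 25, 30, 35, 60):
        print(f"{t:>3} min: {free_phage_per_infected_cell(t):6.1f} progeny per infected cell")
```

The 15-min eclipse period does not appear in the curve because it concerns intracellular particles, which are not released until lysis begins.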


Phage Receptor 

The lipopolysaccharide (LPS) O-antigen (a homopolymer
of 6-deoxy-L-altrose) of YeO3 is the phage receptor since
(i) the phage-resistant YeO3 strains lack the O-antigen and
(ii) the phage is able to infect E. coli strains expressing
the cloned O-antigen of YeO3 (1). In addition to serotype
O:3, phage φYeO3-12 infects other Y. enterocolitica serotypes
(such as O:1 and O:2) carrying 6-deoxy-L-altrose in their
O-antigen. Also, Y. frederiksenii serotype O:3 and Y. mollaretii
serotype O:3 were found to be phage-sensitive (41).

Nucleotide Sequence of φYeO3-12 (42)

The linear DNA of φYeO3-12 is 39,600 bp in size (GenBank/
EMBL/DDBJ database accession No. AJ251805). The genome
has an overall G+C content of 50.6%, which is close to that
of its host, 48.5 ± 1.5% (7). Altogether 58 genes were identified
from the sequence (figure 34-1) (42). The nucleotide sequence
shows striking similarity to that of phage T3 with an
overall sequence identity of 84%. It is also approximately
70% identical to that of phage T7, and therefore is also
closely related to the Y. pestis phage φA1122 (discussed
above). Most of the 58 genes have counterparts in the
genomes of phages φA1122, T3, and T7, and the genes
are organized in the same order. The nucleotide sequences
of phages φYeO3-12, T3, and T7 were compared and the
results plotted using the DOTPLOT program of GCG
(figure 34-2), showing that the three genomes align over
their entire lengths. In a few places the identity lines are
broken by upward or downward shifts caused by the
presence or absence of a gene in the compared genomes.
The high similarity between the φYeO3-12 and T3 genomes
shows that they belong to the same lineage of phages that
have developed different host specificities.
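
The windowed comparison behind a dot plot can be sketched in a few
lines of Python. This is only an illustration of the principle, using
the 50-nucleotide window and 80% identity threshold quoted in the
legend to figure 34-2 as defaults; it is not the GCG COMPARE/DOTPLOT
implementation, and the function names are hypothetical. The brute-force
loop is quadratic and only suitable for short test sequences.

```python
def window_identity(a, b):
    """Fraction of identical positions between two equal-length windows."""
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return matches / len(a)


def dotplot_points(seq1, seq2, window=50, threshold=0.8, step=1):
    """Return (i, j) window start positions that meet the identity threshold.

    Each returned pair corresponds to one dot in the plot: the window of
    seq1 starting at i and the window of seq2 starting at j are at least
    `threshold` identical.
    """
    points = []
    for i in range(0, len(seq1) - window + 1, step):
        w1 = seq1[i:i + window]
        for j in range(0, len(seq2) - window + 1, step):
            if window_identity(w1, seq2[j:j + window]) >= threshold:
                points.append((i, j))
    return points


# Toy sequences (not phage data): seq2 is seq1 with two extra bases in front,
# so the identity line is shifted by two positions.
toy1 = "ACGTACGGTC"
toy2 = "TT" + toy1 + "AA"
print(dotplot_points(toy1, toy2, window=5, threshold=1.0))
```

An insertion or deletion in one genome shifts the diagonal in exactly
this way, which is how the breaks in the identity lines arise.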


Promoters 

The φYeO3-12-specific promoter is identical to that of
phage T3 and consists of a 23 bp sequence
(AATTAACCCTCACTAAAGGGAGA) that runs from −17 to +6 relative to
the transcription start. φYeO3-12 has a total of 15 promoter
motifs located in the genome in the same relative positions
as the T3 promoters (figure 34-1).
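
A search for such promoter motifs can be sketched as a scan of both
strands for the 23 bp sequence, optionally tolerating a few mismatches.
The code below is a minimal illustration, not the method used in the
cited work; the function names and the mismatch parameter are
assumptions, and the test sequence is made up.

```python
PROMOTER = "AATTAACCCTCACTAAAGGGAGA"  # 23 bp promoter sequence quoted in the text

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def find_promoters(genome, motif=PROMOTER, max_mismatches=0):
    """Return sorted (start, strand, mismatches) tuples for motif hits.

    `start` is the 0-based position on the given strand of the leftmost
    base of the hit; '-' hits match the reverse complement of the motif.
    """
    genome = genome.upper()
    hits = []
    for strand, target in (("+", motif), ("-", reverse_complement(motif))):
        for i in range(len(genome) - len(target) + 1):
            window = genome[i:i + len(target)]
            mismatches = sum(1 for a, b in zip(window, target) if a != b)
            if mismatches <= max_mismatches:
                hits.append((i, strand, mismatches))
    return sorted(hits)


# Made-up test sequence with one promoter on each strand.
toy = "GG" + PROMOTER + "TTTT" + reverse_complement(PROMOTER) + "CC"
print(find_promoters(toy))
```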

Terminal Repeats 

The genome contains identical direct terminal repeats of 
232 bp that are 87% identical to the 230 bp terminal 
repeats of T3 and 56% identical to the 160 bp repeats of T7. 
The sequences at the beginning and end of the terminal 
repeats of the three phages are more similar than the 
sequences in the middle, indicating that the mechanisms 
of maturation of the DNA ends might be similar. 

The φYeO3-12 Genes

Similar to the T3 and T7 genomes, φYeO3-12 genes can
be divided into three classes: I (early), II (middle), and III
(late). For most of the φYeO3-12 genes the functions are
predicted based on similarity to T3 and T7 genes. In a
random transposon mutagenesis approach, insertions into
genes 0.45, 0.7, 1.1, 1.3, 1.6, 3.5, 3.7, 4.5, and 5.5 were tolerated,
indicating that their function is not absolutely essential
for the viability of the phage (31). No insertions into the
late genes were recovered.

Figure 34-2 Pairwise genomic comparisons. Comparison of phage φYeO3-12 with phage T3 (on the left) and phage
T7 (on the right). Pairwise comparisons were made using the COMPARE and DOTPLOT programs. Sequences were
scanned using a 50 nucleotide window and each data point in the plot represents 80% identity or more between the
sequences. Regions with gross sequence differences between genomes are circled with dotted lines and the gene
insertions are indicated (Y-nn and T-nn indicate the presence of φYeO3-12 phage-specific and T7 phage-specific genes,
respectively).

Early Genes (Genes 0.3-1.45) 

The gene products (gp) of these genes are involved in controlling
the functions of the host and redirecting its metabolism
to benefit phage propagation. Functions of the
gene 0.45, 0.6A, 0.6B, 1.05, and 1.1 products are not known.
Gp0.3 is S-adenosyl-L-methionine hydrolase, which degrades
the methyl group donor in the host (56). In agreement
with this, no modified nucleotides were detected in φYeO3-12
DNA by high-performance liquid chromatography analysis
(unpublished). Gp0.7 is a protein kinase; gp1, the RNA
polymerase; gp1.2, a deoxyguanosine triphosphohydrolase
inhibitor; and gp1.3, DNA ligase. Gp1.45 shows similarity
to HNH endonucleases.


Middle Genes (Genes 1.5-6.3) 

The middle gene products are involved in phage DNA replication.
Functions of the gene 1.5, 1.6, 1.7, 1.8, 4.15, 4.2, 4.3, 4.5,
5.3, 5.5, 5.7, 5.9, 6.1, and 6.3 products are not known. Gp2 is
an inhibitor of the host RNA polymerase; gp2.5, a single-stranded
DNA binding protein; gp3, an endonuclease;
gp3.5, a lysozyme; gp3.7, a protein similar to a number
of trypsin inhibitors (3); gp4A, a DNA primase-helicase;
gp4B, a primase; gp5A, a DNA polymerase; and gp6, an
exonuclease.


Late Genes (Genes 6.5-19.5) 

The late gene products are either structural components
of the phage particle or are involved in packaging the
genome. Functions of the gene 6.5, 19.2, 19.3, and 19.5
products are not known. Gp6.7 and gp7.3 are involved in
adsorption of phage particles to the host and determining
the host range, respectively. Gp8 is a head-tail connector;
gp9, scaffolding protein; gp10A, major capsid protein; and
gp10B, minor capsid protein. Gp11 and gp12 are tubular tail
proteins, A and B, respectively. Gp13, gp14, gp15, and gp16
are internal phage head proteins, and gp17, the tail
fiber protein. Gp13.5, which is specific for phage φYeO3-12,
may be an endonuclease. Gp17.5 is a holin (holins are
reviewed in chapter 10). Gp18 and gp19 are the small and
large subunits of the DNA packaging protein, respectively.
Gp18.5 and gp18.7 are involved in host cell lysis.

Host Recognition 

The phage tail fibers are required for infectivity of
the phage particles. The T3 tail fiber is a very stable trimer of
gp17, and trimer formation proceeds independently
of its attachment to the tail (24, 25). Orientation of gp17
in the tail fiber is such that the N-terminus is proximal to
the phage particle with subunits oriented coaxially to each
other (26, 57). The counterpart(s) to which the N-terminus
of gp17 binds is not known. The C-terminus of gp17 recognizes
the receptor on the bacterial surface (57). Both T3 and
T7 tail fibers use the outer core region of the E. coli
lipopolysaccharide to initiate adsorption. Gene 17 of φYeO3-12
encodes a protein of 645 amino acid residues, more than
100 residues longer than its T3 and T7 homologs. The
N-terminal 150 amino acids of φYeO3-12 gp17 show marked
(78.7%) similarity to the N-terminal part of gp17 of both
T3 and T7, whereas the C-terminal two thirds of the tail fiber
proteins are far more divergent (ca. 31% similarity). The
differences between the C-terminal regions of φYeO3-12
gp17 and T3 or T7 gp17 may thus be a reflection of host
specificity.

A recombinant T3 phage that carries gene 17 of
φYeO3-12 is able to infect YeO3-c (43). The T3 recombinants
were not stable, indicating that φYeO3-12 gp17 does
not attach firmly to the T3 particle and that additional
mutations would be required to stabilize the recombinants.
Nevertheless, the gene swapping proves that gp17 plays a
major role in determining the host range of T7-group
phages. However, other gene products, such as the major
capsid protein gp10, may also play a role. Monospecific
anti-gp10 antibodies are able to neutralize φYeO3-12 and also
T3, although with lower efficiency (41), indicating that
gp10 may have a role in the initiation of the infection.


Yersiniophage PY54 
Isolation of the Phage 

PY54 is a temperate phage isolated from a nonpathogenic
Y. enterocolitica O:5 strain by induction with mitomycin C
(45). As well as O:5 strains, pathogenic strains belonging
to the serogroup O:5,27 are also infected. Plaques are
formed at 28 °C but not at 37 °C.


Phage Characteristics 

The PY54 virion has a phage λ-like morphology and
the phage was classified in the Siphoviridae family (see
chapter 2). Apart from an obviously identical phage, PY54
showed no significant DNA homology to other temperate
yersiniophages (45). The phage genome is a sticky-ended
double-stranded DNA of 46 kb with 10 nucleotide (5'-GGG
ACA GGC A-3' and 5'-TGC CTG TCC C-3') overhangs at
the 3'-ends (19). 3'-protruding ends are quite uncommon
for phages of Gram-negative bacteria, and have been
previously reported for only a few other phages, such as
Pseudomonas aeruginosa phage D3 (53), Burkholderia mallei
phage φE125 (60), and the E. coli phages HK022, HK97, and
φP27 (22, 51).
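
That the two 10-nucleotide overhangs quoted above can anneal to each
other is easy to confirm: one must be the reverse complement of the
other. The following short Python check is only a sketch; the variable
and function names are illustrative.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def can_anneal(overhang_a, overhang_b):
    """True if two single-stranded ends (both written 5' to 3') are fully complementary."""
    return reverse_complement(overhang_a) == overhang_b.upper()


# The two PY54 overhangs as given in the text, with spaces removed.
overhang_a = "GGGACAGGCA"
overhang_b = "TGCCTGTCCC"
print(can_anneal(overhang_a, overhang_b))
```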

Another unusual property of PY54 is that its prophage
is not integrated into the bacterial chromosome but replicates
as a linear low-copy-number plasmid. Comparison of
the plasmid prophage with phage DNA isolated from
particles revealed that these molecules have the same
size and are nearly 50% circularly permuted. The plasmid
ends are covalently closed hairpin structures similar to the
telomeres found in E. coli phage N15 (52) (phage N15 is
reviewed in chapter 28). Until now N15 was unique in terms
of its replication as a linear plasmid prophage with terminal
hairpins. Similar to PY54, the N15 genome is a circular
permutation of its prophage. In the center of the genome,
N15 contains the telN gene, encoding a protelomerase,
which cleaves a 56 bp palindrome (telRL) located close to
telN (14). The staggered cuts yield overhanging single
strands, which are self-complementary and are able to snap
back. The DNA strands are then rejoined by the protelomerase.
Through this cleaving/joining activity, the N15
plasmid telomeres are generated, each of which is composed
of one DNA strand of the palindrome. In the central part of
the PY54 genome, a 42 bp palindrome and an adjacent
open reading frame (tel) were identified that showed homologies
to the aforementioned sequences of N15 (figure 34-3).
The PY54 tel gene was cloned and expressed in E. coli
and a 77 kDa protein was obtained and purified (19). The
protein was demonstrated to cleave the 42 bp palindromic
sequence of PY54, indicating that the phage uses the
same principle for the generation of the plasmid ends as
N15. However, in vivo studies with PY54 showed that
adjacent DNA sequences from the phage are also important
for successful cleavage. These sequences comprise
another 15 bp inverted repeat flanking the palindrome that
is important for cleavage in infected bacteria (figure 34-3).
The data strongly suggest an essential role of the
protelomerases in the conversion of the phage genomes into the
linear plasmid prophages. Moreover, the N15 protelomerase
has already been shown to be essential for the replication
and maintenance of linear N15 plasmids (48).

Figure 34-3 Structures of the PY54 plasmid ends. A: Nucleotide sequence of the DNA region upstream from the tel
gene (AJ348844). Boxed are the 42 bp palindrome, its flanking 15 bp inverted repeat and the putative start codon,
ribosome binding site (RBS), and putative promoter sequences (-35, -10) of the tel gene. B: DNA sequences of the right and left
plasmid ends.
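
The circular permutation between the virion DNA and the linear plasmid
prophage can be pictured with a toy model: the linear phage genome is
closed into a circle through its cohesive ends and then reopened at the
centrally located tel site. The Python sketch below uses a hypothetical
six-gene map (A to F), not the real PY54 gene order, and a hypothetical
function name.

```python
def prophage_order(virion_genes, tel_after):
    """Gene order of the linear prophage in a simple circular-permutation model.

    The virion gene order is treated as a circle (joined through the
    cohesive ends) and cut immediately after the gene named `tel_after`,
    which stands in for the position of the tel site.
    """
    cut = virion_genes.index(tel_after) + 1
    return virion_genes[cut:] + virion_genes[:cut]


# Hypothetical map with the tel site in the middle, between genes C and D.
virion = ["A", "B", "C", "D", "E", "F"]
print(prophage_order(virion, "C"))
```

With the cut in the middle of the map, the two halves swap places,
which corresponds to the nearly 50% permutation described in the text.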


Guanine + Cytosine (G+C) Content of PY54

The complete nucleotide sequence of phage PY54 has been
determined (20). The genome has a size of 46,339 bp with
an average G+C content of 44.6%, which is slightly lower
than the G+C content reported for the Y. enterocolitica
genome. However, the G+C content of different PY54 open
reading frames (ORFs) varies, ranging from 27.6% to 58%.
With regard to the G+C values, the phage genome can be
divided into 3 regions. The ORFs 1-18 (41.4-48.8%) are
separated from ORFs 20-32 (39.4-58%) by ORF 19 (27.6%).
Similarly the latter ORFs are separated from ORFs 34-67
(34.5-52.3%) by ORF 33 (31.2%). It is possible that ORFs
19 and 33, with a much lower G+C content, are “morons” (17)
(see chapter 27 for further discussion of phage morons).
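
The G+C comparison above amounts to computing the percentage of G and
C bases for each ORF and flagging those that fall well below the genome
average. A minimal Python sketch follows. The 10-percentage-point
margin is an arbitrary assumption chosen for illustration, the function
names are hypothetical, and only the ORF 19, ORF 33, and genome values
are taken from the text; "ORF X" is a made-up control.

```python
def gc_percent(seq):
    """Percentage of G and C bases in a DNA sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    gc = seq.count("G") + seq.count("C")
    return 100.0 * gc / len(seq)


def low_gc_orfs(orf_gc, genome_gc, margin=10.0):
    """Names of ORFs whose G+C content is at least `margin` points below the genome average."""
    return [name for name, gc in orf_gc.items() if genome_gc - gc >= margin]


print(gc_percent("GGCCAATT"))  # toy sequence, half G+C

# G+C values (%) quoted in the text; "ORF X" is hypothetical.
orf_gc = {"ORF 19": 27.6, "ORF 33": 31.2, "ORF X": 45.0}
print(low_gc_orfs(orf_gc, genome_gc=44.6))
```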

Sequence Analysis of the PY54 Genome 

Sixty-seven candidate genes with good coding potential
and a start codon (63 with ATG, two with TTG, two with
GTG) were assigned (table 34-1). For most of the ORFs, a
plausible Shine-Dalgarno sequence was identified. Fifty-five
putative PY54 genes are transcribed rightward (on the
genetic map) and 12 genes leftward. Bioinformatic analysis
revealed 44 gene products for which homologs were
found. The tel site of PY54 is located on the genome similarly
to the tel site in N15 and the attachment site, attP, in
phage λ. In analogy to these phages, the PY54 genome was
divided into a left and a right arm, separated by the tel
site (figure 34-4).


The PY54 Left Arm

As is generally the case for lambdoid phages (see chapter 27),
the left arm of phage PY54 apparently contains “late genes,”
coding for structural and assembly proteins. Again consistent
with lambdoid phages, the right arm consists mainly
of regulatory genes important for plasmid replication,
genes required for host cell lysis, and genes for control of
DNA methylation. Nevertheless, in contrast to phage N15
whose left arm is closely related to λ-like phages, the actual
left-arm genes of phage PY54 are much less similar to this
family.

The products of ORF 1 and ORF 2 of the left arm are
probably the small and large subunits of the phage terminase,
respectively. While the predicted ORF 1 product shows
only weak homology to the terminase small subunits of
the B. mallei phage φE125 and E. coli phage VT2-Sakai, the
ORF 2 product reveals significant similarity to the terminase
large subunits of the phages φP27, φE125, and D3. As
noted above, these phages of Gram-negative bacteria share
the property of 3' protruding DNA ends. The ORFs 3 to 27 of
PY54 are suspected to code for proteins required for virion
head and tail assembly, but up to now only eight of the
deduced gene products could be assigned by their similarities
to known structural proteins of other phages or
by functional analysis. The PY54 major capsid protein is
most probably encoded by ORF 5. Its deduced product is
very similar to the major capsid protein of phage φE125.
Moreover, the analysis of the PY54 structural protein profile
revealed one major protein band of about 38 kDa (45).
This protein was analyzed by peptide mass fingerprinting,
demonstrating that it is indeed encoded by ORF 5 (20).

Figure 34-4 Arrangement of open reading frames (ORFs) on the PY54 genome. The upper part shows the left arm, separated from the right arm (lower part) by the tel
site. The ORFs shown in black indicate similarity to known genes as follows: ORF 31, 32, 34-36, 39, 41, 43, 44, 52, 54, 57, 59, 64 (N15), ORF 13, 22, 25, 33, 50
(lambdoid phages), ORF 1-5, 8, 11, 27, 48, 51, 53, 55, 60, 66 (phages of other Gram-negative bacteria), ORF 15 and 67 (phages of Gram-positive bacteria), ORF 24, 28,
29, 58 (Yersinia), ORF 42, 46, 47, 56, 62 (other bacteria): see table 34-1 for details. ORFs with no database matches are shown in white. The arrows indicate the
positions of the smallest linear (pSH95) and circular (pSH120) miniplasmid. Scans of G+C content across the phage genome are shown below the gene map. See
thebacteriophages.org/frames_0340.htm for a color version of this figure.

There are two other putative PY54 products (from ORFs
3 and 4) that are closely related to phage φE125 structural
proteins. The predicted ORF 3 product exhibits significant
similarity to the portal protein and the ORF 4 product
to the capsid assembly protein of φE125. Therefore it is
conceivable that PY54 has acquired, by recombination with
a φE125-related phage, a gene cluster comprising
ORF 3 (or ORF 2) through ORF 5. Two of the predicted PY54
proteins (22 and 25) are related to phage λ tail proteins
(host specificity protein J and tail fiber protein, respectively).
In addition, the putative ORF 13 product is similar to the
major tail subunit of the λ-like phage HK97, but the
predicted phage PY54 protein is much smaller than the
related HK97 protein. The other assigned structural proteins
(15 and 27) of PY54 show homologies to the tape measure
protein of the Lactococcus lactis phage TP901-1 and the
tail fiber assembly protein of phage APSE-1, respectively.
A PY54 phage mutant harboring a kanamycin resistance gene within
ORF 15 showed capsids containing DNA but no tails, indicating
that this gene is essential for the assembly of the tail.

There are 17 ORFs in the PY54 left arm which might
encode proteins involved in phage assembly but for which
no function could be determined. This is in marked contrast
to the genome of phage N15, in which genes 2 to 21 are
correlated with phage λ genes with up to 90% identity
between the predicted amino acid sequences of the head
gene products (50). Similar to N15, the products of two
genes (ORF 28 and ORF 29) found at the right end of the
left arm of phage PY54 are homologs of proteins responsible
for plasmid partition. The strongest homologies were
found to the SpyA and SpyB proteins encoded by the
Y. enterocolitica virulence plasmid, pYV, and, to a slightly
lesser degree, to the SopA and SopB proteins of phage
N15, which in turn are closely related to the SopA/B proteins of
the F plasmid. These proteins interact with centromere-like
sites (spyC or sopC) comprising direct or inverted repeats.
In the F plasmid, sopC is located downstream from sopA/B
and contains multiple copies of the probable recognition
site (5'-TGGGACCnnGGTCCCA; n denotes variable bases)
(32). Scattered on a 13 kb fragment of N15, four inverted
repeats resembling the sequence shown above were identified
and demonstrated to act as centromeres (16, 46). On plasmid
pYV, spyC is located downstream from spyA/B but
this sequence is composed of direct repeats harboring a
related recognition site (21, 55). The analysis of the phage
PY54 DNA revealed that this phage contains eight identical
copies of the sopC recognition site of the F plasmid. The
inverted repeats are scattered over the whole PY54 genome.
Although a function of these repeats as centromeres has
not been demonstrated yet, it appears that the centromere-like
sites of PY54 are more similar to sopC of the F plasmid
and phage N15 than to spyC of the Y. enterocolitica virulence
plasmid, pYV.
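
Locating copies of a degenerate site such as 5'-TGGGACCnnGGTCCCA in a
genome sequence is a simple pattern search. The Python sketch below
converts the motif into a regular expression and also confirms that its
two arms form an inverted repeat. The helper names are hypothetical and
the test sequence is made up; it is not PY54 sequence.

```python
import re

SOPC_SITE = "TGGGACCnnGGTCCCA"  # recognition site quoted in the text; n = any base

IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "[ACGT]"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def find_sites(seq, motif=SOPC_SITE):
    """0-based start positions of all (possibly overlapping) motif matches."""
    body = "".join(IUPAC[base] for base in motif.upper())
    pattern = re.compile("(?=" + body + ")")  # lookahead keeps overlapping hits
    return [m.start() for m in pattern.finditer(seq.upper())]


def arms_are_inverted_repeat(motif=SOPC_SITE):
    """True if the left arm is the reverse complement of the right arm."""
    core = motif.upper().replace("N", "")
    half = len(core) // 2
    return reverse_complement(core[:half]) == core[half:]


toy = "AAATGGGACCATGGTCCCATTT"  # made-up sequence containing one site
print(find_sites(toy), arms_are_inverted_repeat())
```

Because the arms are reverse complements of each other and the spacer
is unconstrained, a match on one strand is also a match on the other,
so scanning a single strand is sufficient for this motif.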


The PY54 Right Arm 

The right arm of the PY54 genome is considerably more
similar to N15 DNA than the left arm. Of 36 ORFs identified
on the right arm of phage PY54, 13 deduced gene products
(of genes 32, 34, 35, 36, 39, 41, 43, 44, 52, 54, 57, 59, and 64)
show the closest relationship to phage N15 proteins
(table 34-1). In addition, two other putative PY54 products
(of genes 48 and 51) are also similar to N15 proteins. Most
of the related N15 proteins are important for plasmid replication,
are part of the immunity or anti-immunity system
of N15, or their function is unknown. As noted above, the
protelomerase of PY54 is closely related (overall similarity
60%) to the protelomerase, TelN, of phage N15.
Furthermore, both proteins contain a sequence motif at
the C-terminus which is also present in several integrases
and proposed to belong to the active center of these enzymes
(14). It is conceivable that the PY54 protelomerase, like
its N15 counterpart, is involved in plasmid replication by
processing of replicative intermediates.

Another probable replication protein of phage PY54
is specified by ORF 36. This is the largest gene on the PY54
genome and its predicted product shows strong homology
to and is nearly the same size as RepA of N15. RepA is suggested
to be a multifunctional replication protein comprising
primase, helicase, and origin binding activities.
This was concluded from similarities to primases of conjugative
plasmids and the primase of phage P4 (44, 61) (phage
P4 is reviewed in chapter 26). An essential role of RepA
in plasmid replication was also demonstrated by the isolation
of N15 and PY54 miniplasmids. While a circular
4.5 kb miniplasmid of PY54 composed of ORF 36 and part
of ORF 35 retains replicative competence (19), a 4.2 kb DNA
fragment of N15 containing the repA gene and 10 bp of
cB is sufficient to drive the replication of a circular N15
plasmid (49). These observations suggest that the rep genes of
PY54 and N15 have the same function and that the plasmid
ori sites might be located within the rep genes.

The sequence analysis of the PY54 genome revealed 
a number of ORFs (genes 39, 41, 43, 44, 47) whose pre¬ 
dicted products are related to N15 proteins encoded by the 
primary immunity region, immB (34). ORF 39 of PY54 
apparently represents the prophage repressor. The pre¬ 
dicted protein is similar to protein CB of phage N15, which 
in turn is related to lambdoid repressors. By cloning ORF 39 
in Yersinia, it has been shown that this gene confers resis¬ 
tance against lytic infection (20). The ORFs 41 and 43 are 
located upstream from ORF 39. Their products are similar to 
the putative N15 antirepressor, Cro, and the transcription 
antiterminator, Q, respectively. 

In contrast to phage N15, genes 41 and 43 of
phage PY54 are not arranged in an operon but instead are
separated by about 1 kb. The probable function of ORF 41
as an antirepressor has been previously reported. After cloning
this gene in Yersinia, a 10²- to 10³-fold increase in the
number of plaques was obtained upon infection with phage PY54
(20). Further upstream on the PY54 genome there are two
other ORFs (44 and 47) with homologies to the immB
region of N15. Their products share similarity to the ORF
OD1 product of phage N15, whose function is unknown.
Two of the probable PY54 gene products (31 and 57) show
homologies to N15 proteins encoded by the anti-immunity
locus immA, but the homology of the ORF 57 product
is rather weak.

Table 34-1 Bacteriophage PY54 ORF Analysis

ORF  Start  Stop   Strand  Mass (kDa)  pI    Functional assignment (related sequence)                 % Identity
1    61     504    +       16.1        8.8   Terminase small subunit (φE125)                          23
2    568    2301   +       64.3        6.8   Terminase large subunit (φP27, SfV, φE125, D3)           55/53/51/40
3    2492   3751   +       47          8.5   Portal protein (φE125, φP27, HK022)                      49/35/29
4    3906   4817   +       33          4.9   Capsid assembly protein (φE125)                          43
5    5068   6168   +       38.7        5.1   Major capsid protein (φE125)                             48
6    6574   6158   -       14.1        4.1
7    6234   6800   +       19.7        4.4
8    6778   7110   +       12.5        4.1   Unknown (SfV ORF 6 product)                              30
9    7110   7463   +       13.1        9.2
10   7402   7824   +       16.2        5.5
11   7830   8237   +       14.9        9.8   Unknown (D3 ORF 14 product, HK97 gp10)                   29/27
12   8280   8732   +       15.8        4.4
13   8708   8992   +       9.6         4.7   Major tail subunit (HK97)                                35
14   9001   9372   +       13.3        6.3
15   9605   12868  +       115.8       6.5   Tape measure protein (TP901-1)                           26
16   12522  12749  +       8.9         9.3
17   12872  13468  +       22.1        9.3
18   13468  14052  +       21.2        5.8
19   14113  14583  +       18          6.4
20   14605  15006  +       14.5        7.2
21   15036  14794  -       8.7         10.1
22   14999  17542  +       92.9        4.7   Tail protein (λ host specificity protein)                27
23   17269  16949  -       12          9.6
24   17553  18575  +       36.4        8.8   Unknown (Y. pestis hypothetical phage protein)           32
25   18588  20660  +       71.1        4.6   Tail fiber protein (λ)                                   44
26   19402  19139  -       8.5         3.6
27   20660  21067  +       15.1        4.0   Tail fiber assembly protein (APSE-1)                     36
28   22316  21135  -       43.9        8.0   Partitioning (Y. ent. SpyB, N15 SopB)                    56/54
29   23482  22313  -       43.4        5.3   Partitioning (Y. ent. SpyA, N15 SopA)                    85/70
30   22418  22765  +       13.2        8.4
31   23858  24022  +       6.1         11.1  Inhibitor of cell division (N15 Icd)                     59
32   24604  26490  +       71.8        5.6   Protelomerase (N15)                                      40
33   26915  26544  -       14.3        4.6   Unknown (P22 Eaf protein)                                55
34   27257  26928  -       12.9        8.8   Unknown (N15 gp33)                                       45
35   27615  27304  -       12.2        5.1   Unknown (N15 gp35)                                       35
36   31623  27631  -       148.8       7.8   DNA replication (N15 RepA)                               49
37   29846  30091  +       9.1         12.2
38   31862  32131  +       10.6        8.4
39   32524  31877  -       24.1        8.2   Prophage repressor (N15 CB)                              35
40   32319  32564  +       8.9         11.7
41   32629  32832  +       7.9         10.1  Cro repressor (N15)                                      31
42   32747  33451  +       26.2        6.7   Unknown (X. fastidiosa hypothetical protein)             29
43   33452  34165  +       26.6        9.4   Antiterminator Q (N15)                                   31
44   34077  34427  +       13.3        9.1   Unknown (N15 QD1)                                        52
45   34430  34672  +       9.0         10.1
46   34654  35631  +       36.2        4.9   Unknown (P. multocida recombination associated protein)  26
47   35634  35834  +       7.6         9.9   Unknown (N15 QD1)                                        35
48   35883  36410  +       19.5        4.4   Unknown (CP-933O gene Z2097 product)                     43
49   36407  36706  +       10.9        3.9
50   36571  37038  +       17.6        10.3  Unknown (HK022 hypothetical protein)                     50
51   37042  37614  +       20.7        4.8   Exonuclease (CP-933O, CP-933P)                           53/52
52   37538  37978  +       16.6        5.5   Unknown (N15 gp41)                                       40
53   37972  38382  +       15.7        9.6   Unknown (CP-933V gene Z3348 product)                     46
54   38366  38596  +       8.4         10.5  Unknown (N15 gp47)                                       47
55   38596  39228  +       23.8        5.9   DNA adenine-methylase (SfV)                              31
56   39315  39551  +       8.8         5.1   Unknown (S. sonnei ImpC)                                 42
57   39601  39864  +       10.1        5.5   Antirepressor (N15 AntA)                                 35
58   40325  39891  -       15.7        5.0   Unknown (Y. pestis hypothetical protein)                 48
59   40722  40955  +       9.1         4.8   Unknown (N15 gp51)                                       28
60   41231  42304  +       40.9        6.1   DNA adenine-methylase (SfV, CP-933O, φP27)               60/60/58
61   42687  42953  +       9.9         6.5
62   42953  43486  +       19.5        9.4   Lysin (S. dysenteriae, 933W)                             67/64
63   43479  44000  +       18.6        8.2
64   44029  44355  +       11.9        6.3   Unknown (N15 gp57)                                       60
65   44442  44837  +       14.3        4.9
66   45025  45387  +       13.8        10.0  Unknown (SfV ORF 53 product)                             59
67   45456  45830  +       13.9        7.0   Unknown (D29 gp64)                                       51
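
The start and stop coordinates in table 34-1 can be turned into ORF
lengths and rough protein masses. The Python sketch below assumes that
the coordinates are inclusive and that the stop codon is included in
each ORF, which is consistent with the lengths being multiples of three
but is not stated in the table. The figure of 110 Da per residue is a
common rule of thumb, not the method used to calculate the tabulated
masses, and the function names are hypothetical.

```python
def orf_length_bp(start, stop):
    """Length of an ORF in base pairs from inclusive coordinates on either strand."""
    return abs(stop - start) + 1


def residue_count(start, stop, includes_stop_codon=True):
    """Number of amino acid residues encoded by the ORF."""
    length = orf_length_bp(start, stop)
    if length % 3 != 0:
        raise ValueError("ORF length is not a multiple of three")
    codons = length // 3
    return codons - 1 if includes_stop_codon else codons


def approx_mass_kda(n_residues, avg_residue_da=110.0):
    """Rough protein mass in kDa from the residue count (rule-of-thumb average)."""
    return n_residues * avg_residue_da / 1000.0


# Coordinates taken from table 34-1: ORF 1 (plus strand) and ORF 6 (minus strand).
for name, start, stop in (("ORF 1", 61, 504), ("ORF 6", 6574, 6158)):
    n = residue_count(start, stop)
    print(name, orf_length_bp(start, stop), n, round(approx_mass_kda(n), 1))
```

Such estimates agree only approximately with the tabulated masses,
since the true mass depends on amino acid composition.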

The immA locus of N15 is involved in the choice between
the lytic and lysogenic cycle and contains three
genes: icd, antA, and antB (47). The genes are part of an
operon and are transcribed from right to left. Interestingly,
ORF 31 and ORF 57 of PY54, which resemble icd
and antA, respectively, are transcribed from left to right
and are a long way apart. ORF 31 is located at the right end
of the left PY54 arm adjacent to the plasmid partitioning
genes, whereas ORF 57 is located on the right arm about 15
kb upstream from ORF 31. The N15 proteins Icd and AntA are
an inhibitor of cell division and an antirepressor, respectively
(47). An antirepressor function has also been shown
for the antA-related PY54 ORF 57 (20). Thus,
PY54 harbors at least two genes (ORF 41 and ORF 57) that
might act as antirepressors.

The functions of the remaining putative PY54 proteins
that are similar to N15 proteins cannot be predicted
as yet. The products of the PY54 ORFs 34, 35, 52, 54, 59, and
64 yield convincing database matches only to the N15
products 33, 35, 41, 47, 51, and 57, respectively, whose functions
are still unknown. However, the fact that the order
and orientation of the corresponding genes in PY54 and
N15 are identical and that, with the exception of the ORF
54 product, the PY54 products are similar in size
to their N15 counterparts suggests a strong relationship.

Besides ORFs encoding N15-related proteins, the right
PY54 arm contains a number of ORFs whose deduced
products resemble proteins of other organisms. There are
two ORFs (55 and 60) which obviously specify
adenine-specific methyltransferases. The strongest similarities
were detected to enzymes encoded by the Shigella flexneri
phage SfV. ORF 51 may code for a product similar to
exonucleases of cryptic E. coli prophages. The predicted product
of ORF 46 yields database matches to recombination-associated
proteins of Pasteurella multocida and Haemophilus
influenzae. A probable lysis protein closely related to
endolysins of Shigella dysenteriae and E. coli phage 933W may
be encoded by ORF 62. Finally, the right arm of phage
PY54 comprises some ORFs whose predicted products are
similar to hypothetical or unknown proteins of other
bacteria and phages (table 34-1).

Conclusion 

The analysis of the PY54 genome reveals that this phage
is not simply a derivative of phage N15 with an altered
host range but a unique phage which has evolved independently.
The genes on the left arm of PY54 are much
more divergent from λ-like genes than those of N15.
Although it can be assumed that this part of the PY54
genome comprises structural genes, many predicted gene
products could not be assigned functions. In contrast, a
significant number of genes located on the phage PY54
right arm show striking similarities to N15 genes. The probable
products of these related genes are mainly involved
in plasmid replication or have regulatory functions. Some
of the genes have unknown functions. It can be suggested
that the genes which are similar in PY54 and N15
were acquired by horizontal exchange. However, the right
PY54 arm additionally contains a number of genes quite
different from those of phage N15, confirming the genetic
mosaicism demonstrated for lambdoid phages (22) (see
chapter 27).

While this chapter was being completed, another linear
plasmid prophage, KO2, was reported for Klebsiella oxytoca
(13). Phage KO2 has not been propagated on indicator
strains as yet. It has a genome of 51,601 bp, for which 64
genes have been predicted. The sequence analysis disclosed
a strong relationship to N15. Some regions of the genome,
for example genes 1 to 15 (head and tail shaft), 24 and 25
(partitioning), 26 (protelomerase), and 35 (replicase), are
similar to genes of phage PY54. The data imply that KO2
has a similar life-style to N15 and PY54 and suggest
that the number of phages known to replicate as linear plasmids
with covalently closed ends will continue to increase in the
near future.


Bacillus subtilis strains, like many other bacteria, contain
temperate bacteriophage genomes within their
chromosomes. Some of these bacteriophages form viable
infective particles upon induction, for example φ105 and
SPβ. Other bacteriophages, such as PBSX, appear to be
remnants of intact viruses that no longer have a complete
genome but package fragments of the bacterial chromosome
during phage maturation.

B. subtilis temperate bacteriophages come in a variety 
of shapes and sizes, which will be detailed in the next 
section. Despite physical differences, these phages have one 
thing in common: only linear, double-stranded DNA, be it 
bacteriophage or bacterial chromosome in origin, is pack¬ 
aged in the phage particle. There have been no reports of 
single-stranded DNA phages in Bacillus species, and only 
one report of an RNA phage, AP50, which is specific for 
Bacillus anthracis (89). 

A chapter on B. subtilis temperate bacteriophages 
appeared in the first edition of The Bacteriophages (151). 
Since that time, a wealth of information has been reported 
about these phages. New isolates have been identified, 
phage-specific gene products have been characterized, and 
the DNA sequences of the entire SPβ and PBSX genomes 
have been determined. 


Physical Characteristics 
of Bacteriophages 

B. subtilis temperate phages are organized into five groups 
based on phage immunity, serology, host range, virion and 
genome size, DNA homology, and restriction fragment 
maps (21, 143, 151, 152). Table 35-1 indicates key features 
that define the groups and lists well-characterized members 
of each group. Bacillus phages in groups I, II, and III, which 
have long noncontractile tails, belong to the viral family 
Siphoviridae. Group V phages, with contractile tails, belong 
to the family Myoviridae (88). Group V phages are distinct 
from other phage types in that they do not package their 
own genomes during lytic growth, but incorporate small 
fragments of chromosomal DNA into their heads (96). See 
chapter 2 for a general discussion of phage classification 
based on virion morphology. 

Phage Serology, Immunity, and Host Range 

Antiserum prepared against one member of a group neutralizes 
other phages of the same group, but not phages in 
different groups. For example, antiserum raised against 
phage φ105 (group I) does not react with φ3T (group III) 
but does neutralize ρ6, another group I phage. Within the 
group III phages, it appears that there may be more than 
one serological group. Antiserum raised against SPβ neutralizes 
SPβ itself as well as phages φ3T, ρ11, IG1, IG3, and 
IG4 but not SPR or H2 (36, 139). Anti-SPR antiserum neutralizes 
only SPR (92), and anti-H2 antiserum inactivates 
only H2 (138). 

Similarly, when a host cell is lysogenized by a phage, it 
is immune to infection by other phages in the same group 
but may be infected by phages of another group. An exception 
to this general rule is seen with B. subtilis strain 168, 
which carries group III SPβ as an endogenous prophage. 
Group III phages φ3T and ρ11 form plaques on this strain, 
but phages Z, SPR, IG1, IG3, IG4, and H2 do not (21, 36). 
There must therefore be two distinct immunity types within 
the group III phages. Based on host range and DNA homology 
studies, Weiner and Zahler (139) proposed that group III 
phages be subdivided into three subgroups, with SPβ, φ3T, 
ρ11, IG1, IG3, and Z in subgroup 1, SPR in subgroup 2, and 
H2 in subgroup 3. 

B. subtilis strain 168 serves as a host for phages from 
each of the five groups. This strain normally carries SPβ 
and the defective phage PBSX as resident prophages (21, 
121, 137). B. subtilis strain R is lysogenic for SPR, which 
was initially misidentified as SPβ (130). Only phage SP16 
forms plaques on B. subtilis strain W23. In addition, this 
phage forms plaques on Bacillus amyloliquefaciens strain H, 
Bacillus pumilus, Bacillus licheniformis, and Bacillus globigii, 
which are close relatives of B. subtilis (21, 84). Phage H2 also 
lysogenizes Bacillus amyloliquefaciens strain H and Bacillus 
pumilus (138). 

Table 35-1 Physical Characteristics of B. subtilis Bacteriophages 

Group  Phage members a          Head size  Tail size b   DNA length      DNA G+C  References 
                                (nm)       (nm)          (kb)            (mol%) 
I      φ105, ρ6, ρ10, ρ14       50-52      10 x 220 (N)  38-40           43.5     7, 13 
II     SP02                     50         10 x 180 (N)  40              43       8, 13, 14, 105 
III    SPβ, φ3T, Z, ρ11, SPR,   72 x 82    12 x 358 (N)  110-134         31       22, 28, 35, 36, 54, 85, 
       IG1, IG3, IG4, H2                                                          132, 137, 153, 155 
IV     SP16                     61 x 61    12 x 192      60              37.8     84, 97 
V      PBSX, PBSW, PBSY, PBSZ   45 x 45    20 x 200 (C)  30; 13 kb       43 c     2, 121, 144 
                                                         bacterial DNA 
                                                         packaged 

a Typical phage for each group is indicated by bold-faced type. 

b N, noncontractile tail; C, contractile. 

c Mol% G+C of packaged B. subtilis chromosomal DNA (69). 

B. subtilis strain W23 carries defective phage PBSZ, which 
resembles PBSX (137). After induction and lytic growth, 
these defective bacteriophages bind to other nonlysogenic 
Bacillus strains and kill them. The term “phibacin” (phage-like 
bacteriocin) has been coined to describe the killing 
effect of phages such as PBSX (82). 

Particle and Genome Sizes 

Group I and II phages have icosahedral heads about 50 nm 
in diameter, whereas group III and IV phages have somewhat 
larger heads (see table 35-1). All these phages, with the 
possible exception of SP16, have long, thin, flexible 
noncontractile tails. Figure 35-1 is an electron micrograph of the 
group III phage SPβ that demonstrates the typical appearance 
of B. subtilis temperate phages. At the end of the tail 
there is a six-lobed foot structure that probably functions in 
attaching the phage particle to its receptor in the bacterial 
cell wall. 

Additional Bacillus temperate phages have recently been 
reported by Ackermann et al. (1). One of these, phage species 
SN45, has an elongated head (80 x 40 nm) that encompasses 
its 39 kb genome. At the end of its 287 nm long tail 
there is a baseplate resembling that of φ105, leading to 
speculation that the two phages might be related. 

The linear, double-stranded DNA genomes of the phages 
in groups I and II are about 40 kb long, while the SP16 
genome is larger, about 60 kb. Phages in group III have 
genomes greater than 110 kb in length, about 2-2.5 times 
longer than that of bacteriophage λ. Group III phage 
genomes contain genes essential for phage propagation, 
prophage maintenance, and several additional genes with 
diverse functions (see below). 

The DNA sequence of the entire SPβ genome has been 
determined. It consists of 134,416 bp (74). The genome 
contains 187 putative open reading frames (ORFs), many 
of which encode products of fewer than 100 amino acids. 
A recently completed peptide profile of the B. subtilis 
genome revealed that small peptides, that is, peptides 
encoded by fewer than 85 codons, are commonly organized 
into clusters as part of a prophage genome (156). The 
function of these small peptides is for the most part 
unknown; however, sspC, encoding a 72-amino acid small, 
acid-soluble spore protein (16), is one of the ORFs in the 
SPβ genome. 

Figure 35-1 Electron micrograph of a negatively-stained 
preparation of SPβ particles. The long, flexible, 
noncontractile tail ends in a complex structure that probably 
is involved in phage attachment to the bacterial cell. Bar 
represents 100 nm. 

The ORFs are divided into clusters based on function. 
Cluster I [nucleotide (nt) 41-21224] comprises ORFs adjacent 
to the left prophage attachment site, which may be involved 
in phage integration and bacteriocin expression. Cluster II 
(nt 21280-64109) contains genes possibly associated with 
late phage function, such as cell lysis and perhaps phage 
structural proteins. Cluster III (nt 65062-134361) contains 
genes that act early in the phage life cycle, for example 
the putative repressor gene, yonR. 
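
The cluster coordinates quoted above can be turned into lengths and genome fractions with a few lines of arithmetic. The following is only a sketch: it assumes that the coordinates are 1-based and inclusive, and the names CLUSTERS, cluster_lengths, and genome_fraction are illustrative and do not come from the sequencing report (74). 

```python
# Cluster boundaries (nt) and genome length as quoted in the text for SPβ.
# Assumption: coordinates are 1-based and inclusive at both ends.
GENOME_LENGTH = 134_416
CLUSTERS = {
    "I": (41, 21224),
    "II": (21280, 64109),
    "III": (65062, 134361),
}


def cluster_lengths(clusters):
    """Return the length in bp of each cluster."""
    return {name: end - start + 1 for name, (start, end) in clusters.items()}


def genome_fraction(clusters, genome_length):
    """Return the fraction of the genome covered by each cluster."""
    lengths = cluster_lengths(clusters)
    return {name: length / genome_length for name, length in lengths.items()}


if __name__ == "__main__":
    fractions = genome_fraction(CLUSTERS, GENOME_LENGTH)
    for name, length in cluster_lengths(CLUSTERS).items():
        print(f"Cluster {name}: {length} bp ({fractions[name]:.1%} of genome)")
```

Under these assumptions cluster III, which holds the early genes, accounts for roughly half of the genome. 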

The SPβ sequencing project revealed that the G+C 
content of the phage was on average 34.6 mol%, which 
correlates well with the previously reported value (137). Because 
of the variance in G+C content of phage DNA versus that of 
chromosomal DNA, which is on average 43.5 mol% (69), there 
is a notable difference in codon usage by the phage. Phage 
genes seem to use TTG, GTG, CTG and even ATT as start 
codons, as well as the common ATG. Moszer et al. (87) 
propose that the presence of codon bias is an indicator of 
systematic lateral gene transfer in B. subtilis that might 
result from prophage integration. 

For the group V phages, determination of genome size is 
slightly more difficult because these phages package 
chromosomal DNA fragments, not their own genomes. Although 
PBSX packages 13 kb of chromosomal DNA (2), Wood et al. 
(144) determined that the PBSX genome itself is about 30 kb 
long by sequencing chromosomal DNA encompassing the 
PBSX prophage. The genome comprises at least a repressor 
gene (xre), phage head and tail proteins, and cell lysis 
functions (see below). 

A phage resembling PBSX, PBND8, has been isolated 
from the related bacterium, Bacillus natto (131). The overall 
phage structure is similar but PBND8 has a smaller head 
containing a correspondingly shorter length of chromosomal 
DNA, only about 8 kb. It appears that the group V 
phages package specific lengths of chromosomal DNA 
based on the size of the phage head. 

The head sizes of phages in groups I-IV accommodate 
a phage genome length of DNA. Occasionally, φ105 packages 
about 1 kb more than its genome length of 39.2 kb (32) 
without adverse effects on the phage head. Sometimes 
phages carry DNA molecules that have lost between 2 and 
14 kb from their genomes. Viable deletion mutants have 
been identified for phages φ105 (43), SP02 (50), ρ11 (63), 
ρ14 (68), and SPβ (39, 41, 118). These deletion mutants 
define specific regions not essential for phage propagation 
or lysogeny, providing useful information for the construction 
of phage cloning vectors (see below). 

The ends of the linear phage DNA molecules examined 
thus far are of two varieties. SP16 contains terminally 
redundant, circularly permuted double-stranded ends (97). 
This suggests that the phage packages its DNA using a 
headful mechanism like that of Salmonella phage P22 (113) 
(reviewed in chapter 29). Other phages, including φ105 (111) 
and SP02 (13), have constant cohesive ends resembling 
those of phage λ (see chapter 27) except that the Bacillus 
phage DNAs have 3' single-stranded extensions rather 
than 5'. This type of specific single-stranded extension is 
probably generated, as it is in λ, by a precise endonucleolytic 
cleavage that occurs during phage packaging (11). The ends 
of the φ105 genome have the sequences 5'-GCGCTCC-3' and 
3'-CGCGAGG-5' (29). Regardless of the end structure of the 
phage genome, these ends must join to convert the linear 
molecule into a circular one prior to prophage integration. 
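
That the two φ105 end sequences quoted above can base-pair with each other, as cohesive ends must in order to circularize the genome, is easy to check by hand or in code. The sketch below is illustrative only; the function names are hypothetical, and it assumes the standard Watson-Crick pairing rules and the strand orientations exactly as printed (one end written 5' to 3', the other 3' to 5'). 

```python
# Watson-Crick complement table for unambiguous DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def complement(seq):
    """Return the base-by-base complement (orientation is NOT reversed)."""
    return seq.upper().translate(_COMPLEMENT)


def ends_can_anneal(end_5_to_3, end_3_to_5):
    """True if the two single-stranded ends pair at every position.

    The first argument is written 5'->3' and the second 3'->5', so the
    strands are already antiparallel and are compared position by position.
    """
    if len(end_5_to_3) != len(end_3_to_5):
        return False
    return complement(end_5_to_3) == end_3_to_5.upper()


if __name__ == "__main__":
    # End sequences of the phi105 genome as quoted in the text (29).
    print(ends_can_anneal("GCGCTCC", "CGCGAGG"))
```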

Establishing and Maintaining Lysogeny 

Adsorption to Cells 

To initiate infection, B. subtilis temperate phages first bind 
to the bacterial cell. Proteins at the end of the phage tail 
adsorb to specific receptors on the surface of the host cell. 
Phage-resistant B. subtilis mutants have been isolated; most 
of the defects map to the gtaB-tagB region of the chromosome, 
which contains genes involved in teichoic acid synthesis. 
One such mutant, pha-3, eliminates adsorption by all 
group III phages except SPR (33). gtaB encodes UDP-glucose 
pyrophosphorylase, which functions in the glycosylation of 
cell wall teichoic acids (136). Different mutations in gtaB 
alter or eliminate bacteriophage φ3T adsorption (116). Another 
locus, gneA, located between sacA and purA, encodes 
UDP-N-acetylglucosamine 4-epimerase (34). Mutations in this gene 
lead to cells with galactosamine-deficient teichoic acid, and 
these cells are also resistant to φ3T adsorption. 

The defective B. subtilis phages also bind to teichoic 
acid in the cell walls, but with different specificities. PBSX 
does not bind to host B. subtilis strain 168 cell walls, but 
will attach to non-host strain W23 cell walls to initiate 
killing of the non-host strain. Conversely, phage PBSZ will 
only adsorb to non-host strain 168 cells and not to host 
strain W23 cells. Karamata and coworkers (61, 149) found 
that the tagl gene in strain 168, mediating glycerol teichoic 
acid synthesis, can be replaced by the tar gene from strain 
W23, which specifies ribitol teichoic acid synthesis. Such 
interstrain hybrids show the phage sensitivity pattern of 
strain W23, indicating that the specificity of phage binding 
depends on whether the cells have ribitol or glycerol 
teichoic acid in their walls. 

Attachment Sites 

Bacillus phage genomes insert into the bacterial chromosome 
by a Campbell-type, single crossover event (10) (see 
also chapter 7). The linear phage DNA circularizes, then 
integrates at its specific attachment site, attB (140). The 
known chromosomal locations of phage integration sites 
are diagrammed in figure 35-2. The φ105, SP02, PBSX, and 
H2 attachment sites occur at different locations in the 
chromosome (20, 47, 58, 108, 115, 144, 153). In contrast, the 
attachment sites for phages φ3T, SPβ, IG1, IG3, and IG4 
are clustered between ilvA and gltA, near the terminus of 
DNA replication (37, 56, 142, 154). These different attachment 
sites may represent remnants of ancestral prophage 
DNA (103). 






Figure 35-2 Location of bacteriophage attachment sites in 
the B. subtilis chromosome. Genetic landmarks and approximate 
locations of prophage attachment sites are indicated. 
Numbers refer to kilobase position in B. subtilis genome (69). 


B. subtilis strain 168 contains SPβ and PBSX as endogenous 
prophages, but strains have been constructed that 
lack both prophages. These cured strains recombine, repair 
damaged DNA, and sporulate as well as the lysogenized 
strain (148). In addition, despite the fact that SPβ lies 
near the terminus of chromosomal DNA replication, elimination 
of the SPβ prophage does not alter where chromosomal 
DNA replication ends (59). 


For phage SPβ, the bacterial (attB), phage (attP), and 
prophage (attL and attR) attachment-site sequences have 
been identified (12, 74, 146); they are diagrammed in 
figure 35-3. The SPβ attachment sites resemble those identified 
for coliphage λ. The λ attachment sites contain an 
AT-rich core of identical nucleotides, and the λ attP core is 
flanked by regions of high similarity (140). The SPβ core 
region is also AT-rich (30% G+C), having the nucleotide 
sequence 5'-ATACAGCTTTATCTGT-3'. Segments flanking 
the SPβ attP core contain inverted repeats (figure 35-3, 
arrows) that resemble integrase binding sites in phage λ 
DNA (107). There are also AT-rich regions in attP, which 
might correspond to integration host factor protein (IHF) 
binding sites (104). IHF binding at phage λ attP results 
in DNA bending, facilitating site-specific recombination 
mediated by λ integrase (86). 
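
The AT-richness of the 16-nucleotide core quoted above can be confirmed by counting bases. The function below is a minimal sketch, not a method taken from the cited work; the name gc_percent is hypothetical. Applied to the core sequence as printed, it gives a value close to the approximately 30% G+C stated in the text. 

```python
def gc_percent(seq):
    """Return the G+C content of a DNA sequence as a percentage."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    gc_count = seq.count("G") + seq.count("C")
    return 100.0 * gc_count / len(seq)


if __name__ == "__main__":
    # Core sequence of the attachment sites as quoted in the text.
    core = "ATACAGCTTTATCTGT"
    print(f"{len(core)} nt core, {gc_percent(core):.2f}% G+C")
```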

An integration-deficient variant of phage SPβ, the 
int-5 mutant, has been isolated (150). This phage rarely 
inserts at the normal SPβ attB adjacent to ilvA, but inserts 
at other sites in the B. subtilis genome. These alternative 
insertions may result from recombination between SPβ 
DNA and regions of chromosomal DNA that have strong 
homology to the phage genome (74, 103). Apparently the 
lack of integrase function also affects excision of the SPβ 
int-5 mutants. These mutants are not efficiently induced, 
and when they do excise from the chromosome, they delete 
bacterial genes adjacent to their insertion sites. For example, 
excision of the SPβ int-5 phage integrated at its normal 
attachment site can result in the loss of DNA from the 
prophage to beyond terC, a deletion of about 230 kb (57). 

Phage Repressor and Genetic Controls 

To maintain lysogeny, prophage gene expression must 
be controlled by a repressor protein. φ105, SPβ, ρ6, and 
φ3T phages carrying defective repressor (c) genes have been 
isolated (18, 21, 137). With a nonfunctional repressor, these 
phages cannot lysogenize their hosts, yielding a clear-plaque 
morphology. 

Figure 35-3 DNA sequence of SPβ attachment sites. Attachment 
sites are those associated with the bacteriophage (attP SPβ), 
the bacterial chromosome (attB SPβ), and the prophage (attL SPβ 
and attR SPβ). Phage DNA sequences are italicized. Core region 
is in bold-face type, arrows denote inverted repeats, AT-rich 
regions overlined. 

The gene encoding the 144 amino acid φ105 repressor 
protein, cφ105, has been cloned and sequenced (19, 25). The 
N-terminal end of the repressor contains a helix-turn-helix 
(HTH) motif similar to that of the λ Cro repressor (133). φ105 
mutants with alterations occurring within the first 43 residues 
of the protein cannot bind to phage promoter regions. 
The repressor functions as a tetramer, binding to operator 
sites in the φ105 immunity region (immF). 

The organization of the φ105 genetic control region is 
diagrammed in figure 35-4A. As with phage λ repressor, 
the φ105 repressor interacts with two divergent promoters 
in the immunity region. Transcription from PR leads to 
expression of late phage genes and lytic growth, while 
transcription from PM leads to repressor synthesis and lysogeny. 

By constructing fusions between these promoters and 
the cat-86 gene, Van Kaer and coworkers (134) demonstrated 
that the φ105 repressor stimulates expression from PM 
and represses expression from PR. The repressor protein 
binds to six operator (OR) sites, three of which (OR1, OR2, 
and OR3) have a common sequence, 5'-GACGGAAATACAAG-3', 
which is unusual in that it does not show 2-fold 
rotational symmetry (134, 135). OR4 and OR5 differ from 
this consensus sequence in two residues and OR6 differs 
at five sites. All the operator sequences lie within the 
approximately 200 bp region between the two promoters with 
the exception of OR3, which occurs about 250 bp downstream 
of PR. The φ105 repressor binds tightly to OR1, OR2, 
OR3, and OR6; binding of the repressor at these sites 
precludes transcription from PR. Weaker binding occurs at 
OR4 and OR5; repressor binding at these sites may 
autoregulate repressor expression. 

Figure 35-4 Organization of genetic control regions. 
A: Phage φ105. B: Phage PBSX. Small arrows indicate 
orientation of operator sequences. For further details, 
see (135) and (81) for φ105 and PBSX, respectively. 


The PBSX repressor gene, xre, has also been cloned 
and sequenced (145). The repressor (Xre) is 113 amino acids 
long and contains an HTH motif at its N-terminal end. A 
temperature-sensitive repressor mutant, xhi1479 (9), differs 
from the wild-type protein at three residues, one of which 
is in the HTH region. The promoter region of the xre gene 
is diagrammed in figure 35-4B. Another gene, encoding 
ORF10, is adjacent to xre but divergently transcribed, which 
resembles the situation in the φ105 immF. Between xre 
and ORF10 there are four 15 bp palindromic sequences, 
5'-GATACATTTTGTATC-3', which bind to purified Xre protein 
(81, 82). These sites have been called O1, O2, O3, and O4. At 
low concentrations, Xre binds first to O1 and O2 to prevent 
transcription rightward through ORF10 and the late genes, 
but at higher concentrations it can prevent its own synthesis 
by binding at O3 and O4 (81). 
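
The contrast between the φ105 operator consensus, which lacks 2-fold rotational symmetry, and the palindromic Xre binding site can be made concrete by comparing each sequence with its own reverse complement. The sketch below is an illustration under stated assumptions: it takes both sequences exactly as printed in this chapter, and the function names are hypothetical. A perfectly symmetric site of even length would match at every position; a site of odd length can never match at its central base. 

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def symmetry_matches(seq):
    """Count positions at which a sequence equals its reverse complement.

    A higher count means the site is closer to a perfect palindrome,
    that is, closer to 2-fold rotational symmetry.
    """
    seq = seq.upper()
    return sum(a == b for a, b in zip(seq, reverse_complement(seq)))


if __name__ == "__main__":
    # Sequences as quoted in the text for the two repressor binding sites.
    sites = {
        "phi105 OR consensus": "GACGGAAATACAAG",
        "PBSX Xre operator": "GATACATTTTGTATC",
    }
    for name, seq in sites.items():
        print(f"{name}: {symmetry_matches(seq)} of {len(seq)} positions symmetric")
```

With the sequences as printed, the Xre site is symmetric in its two six-base arms (GATACA and TGTATC) around a short central stretch, whereas only a few positions of the φ105 consensus match. 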

In the SPβ genome, the yonR ORF has been assigned 
as the repressor gene by virtue of its homology with the 
PBSX xre gene (74). Another gene involved in SPβ immunity, 
d or yomJ, was previously cloned and shown to confer 
SPβ immunity to the host cell (80). However, the d gene 
does not complement SPβ c mutants; thus it is not the main 
phage repressor gene. A similar situation occurs in φ105, 
where genes on two different cloned DNA fragments can 
confer phage immunity to the host, but only one of the 
genes actually produces a protein that binds to φ105 operator 
sequences (24). 


Prophage Induction and 
Specialized Transduction 

Prophage Induction 

All B. subtilis temperate phages are induced to enter lytic 
growth when the host cell DNA has been damaged. Physical 
and chemical treatments, including mitomycin C, UV light 
and N-methyl-N-nitrosoguanidine (NTG), stimulate the 
cellular SOS-like response, which has been called the 
SOB regulon in B. subtilis (147). Expression of the B. subtilis 
RecA protein is stimulated following DNA damage. In the 
presence of RecA, prophage repressor proteins are inactivated, 
resulting in phage induction. The phage genome 
excises from the chromosome by a reversal of the 
site-specific recombination event that led to phage integration 
(for review of related regulatory circuitry as displayed by 
phage λ, see chapter 8). 

Some phages are induced by other cell-damaging treatments; 
for example, exposure to hydrogen peroxide stimulates 
growth of PBSX and possibly SPβ (53, 122). In addition, 
IG1 is induced by the DNA polymerase III inhibitor, 
6-(p-hydroxyphenylazo)-uracil, and can multiply effectively 
despite a drastic decline in host cell viability (38). When 
B. subtilis cells develop competence, some of the SOB-related 
genes, such as recA, are activated. Prophages of φ105 and 
SP02 are induced during competence, but φ3T and SPβ 
prophages have developed a mechanism to prevent induction 
during competence development (83). 

Temperature-sensitive repressor mutations have also 
been isolated for φ105 [cts23 (3)], SPβ [c2 (106)], and PBSX 
[xhi1479 (9, 145)]. Following a brief heat shock (48-50°C), 
the repressor function is destroyed and the prophage enters 
lytic growth. 

During sporulation, the skin (sigK intervening) element, 
a 42 kb long segment of the chromosome at position 230°, 
is released by a site-specific recombination event resembling 
prophage excision (124). DNA sequence analysis of the 
skin element reveals strong homology between it and PBSX, 
including genes for a potential repressor and an autolysin 
(67). Despite the high degree of similarity between the 
two DNAs, the skin late gene operon is not expressed and no 
phage particles are produced. The skin element may represent 
a remnant of some ancestral prophage. 

Specialized Transducing Phages 

Occasionally prophage excision is not precise and recombination 
takes place between phage DNA and adjacent 
chromosomal DNA, which results in the formation of specialized 
transducing particles (150). φ105 mediates specialized 
transduction of genes flanking its attachment site, that is 
ilvBC-leu and pheA, but at low frequency and only by replacement 
of defective alleles in recipient cells (114). 

In contrast, SPβ mediates specialized transduction of 
genes flanking its normal attachment site, that is ilvD-thyB-ilvA 
and kauA-odh (citK), but it generates high frequency 
of transduction (HFT) lysates (40, 106, 154). These HFT 
lysates have transducing particle-to-phage particle ratios 
between 1:10^2 and 1:10^4. When SPβ integrates at secondary 
sites in the chromosome, it also mediates specialized 
transduction of genes flanking the prophage, for example 
ilvBC-leu, dal-ddl, glnA, and degU (46, 77, 79, 100). Phages 
IG1, IG3, IG4, H2, and φ3T also mediate specialized transduction 
of genes adjacent to their chromosomal attachment 
sites (37, 95, 153). 

Only one nondefective specialized transducing particle 
has been reported: SPβ c2 pilvA (42). Figure 35-5 depicts 
how this specialized transducing phage may have been 
generated. DNA sequence data obtained from the SPβ 
c2 pilvA phage (4) have been compared with the 
DNA upstream of ilvA and to SPβ sequences (69, 74). 
Apparently there is a region of approximately 27 bp in the 
bacterial chromosome upstream of ilvA [nt 2293436-2293462 
(69)] that is very homologous to a part of SPβ 
near the right end of the prophage [nt 126571-126588 (74)], 
with 22 of 27 bases identical. Possibly this short region of 
homology was enough to serve as a recombination site 
during the genesis of the SPβ c2 pilvA phage. This finding 
lends support to the notion that ancestral phage DNA exists 
in the chromosome (103). In the SPβ c2 pilvA particle, 
approximately 8 kb of the phage genome are replaced by 
DNA from the bacterial chromosome; prophage genes to 
the right of yosR do not appear to encode essential phage 
proteins. 


Phage-Specific Genes 
DNA Methyltransferases 

The genomes of group III phages φ3T, SPβ, ρ11, SPR, and 
H2 contain methyltransferase-encoding genes, which 
are expressed during vegetative growth of the phages. 
Phage-derived methyltransferase (MTase) activity was first 
observed in phages φ3T and SPR, when it was discovered 
that the phage DNAs contained HaeIII (BsuRI) restriction 
sites that were methylated during phage growth 
(17, 130). MTase modification may serve to protect the 
phage genomes from DNA restriction as they pass into 
new hosts. 

Figure 35-5 DNA content of specialized transducing bacteriophage 
SPβ c2 pilvA. Phage DNA sequences are italicized. 
Indicated kilobases denote locations in the B. subtilis 
chromosome (69). 

The phage-encoded MTases are all type II, requiring 
S-adenosylmethionine as the methyl group donor (see 93 
for a review), but some of these phage MTases are unique in 
that they modify multiple recognition sites. The phage 
MTases mediate 5-C methylation of a C residue in their 
respective target sequences. Most of the phage MTases 
modify the HaeIII (BsuRI) site, GGCC, methylating the 
internal C residue (92). In addition, the SPβ, φ3T, and 
ρ11B MTases modify the Fnu4HI site, GCNGC, again at 
the internal C residue (70, 128). The three genes encoding 
these MTases have been isolated and expressed in Escherichia 
coli, and the proteins are all very similar in size 
(~47 kDa) and structure (91). The MTase from 
a ρ11 variant, ρ11s, modifies the HaeIII sequence, but 
also acts on the Bsp1286 site, G(A/G/T)GC(T/C/A)C; modification 
of this sequence has not been completely characterized 
(6). 

The SPR MTase has a more complex nature because 
it modifies three sites: HaeIII, HpaII (CCGG), and EcoRII 
(CC(A/T)GG). It may modify both cytosine residues, or only 
the internal cytosine of the HaeIII sequence (52, 129). 
The H2 MTase also has triple sequence specificity, 
methylating the HaeIII, Fnu4HI, and Bsp1286 sites (71). 
In addition, H2 carries a separate, second MTase gene 
that protects BamHI sites (GGATCC) in the phage, modifying 
the internal C residue (15, 155). Similarly, Trautner and 
coworkers have found that φ3T and ρ11s also carry a 
second MTase; this one recognizes the sequence TCGA, 
and methylates the C residue (94). Although phage Z 
does not inherently contain a MTase gene, it does have 
homology with DNA that flanks MTases in related 
group III phages, and a functional MTase can be recombined 
into the Z genome from one of these other 
phages (126). 
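
The recognition sequences listed above, several of which are degenerate, can be located in a DNA sequence by translating each site into a regular expression. The following is a minimal sketch rather than a description of how the cited studies mapped methylation sites. The site table is taken from the sequences quoted in this section (with N, W, D, and H as the standard IUPAC codes for the bracketed alternatives); the names SITES, site_regex, and find_sites, and the short test sequence, are hypothetical. 

```python
import re

# IUPAC nucleotide codes needed for the sites quoted in the text.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "N": "[ACGT]",  # any base
    "W": "[AT]",    # A or T
    "D": "[AGT]",   # not C
    "H": "[ACT]",   # not G
}

# Recognition sequences as given in this section.
SITES = {
    "HaeIII": "GGCC",
    "Fnu4HI": "GCNGC",
    "Bsp1286": "GDGCHC",
    "HpaII": "CCGG",
    "EcoRII": "CCWGG",
    "BamHI": "GGATCC",
    "TCGA": "TCGA",
}


def site_regex(site):
    """Compile a recognition sequence into a regex.

    A lookahead is used so that overlapping occurrences are all reported.
    """
    pattern = "".join(IUPAC[base] for base in site.upper())
    return re.compile(f"(?=({pattern}))")


def find_sites(seq, site):
    """Return 0-based start positions of a recognition site in seq."""
    return [m.start() for m in site_regex(site).finditer(seq.upper())]


if __name__ == "__main__":
    # Arbitrary illustrative sequence, not taken from any phage genome.
    dna = "AAGGCCTTGCAGCGGATCC"
    for name, site in SITES.items():
        print(name, site, find_sites(dna, site))
```

Because all of the listed sites are palindromic (allowing for degeneracy), scanning one strand is sufficient to find every occurrence on both strands. 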

Regardless of the specificity of the DNA binding sites, 
there are several features common to the phage MTases. 
The N-terminal end of the MTases is the site of 
S-adenosylmethionine binding and possibly the catalytic site, while 
the C-terminal end is required for general DNA interaction. 
Between the conserved elements, there is a variable 
region responsible for binding to the specific target site in 
the DNA (51, 94). MTases with multiple sequence specificities 
contain independent target-recognizing domains in 
a modular organization (70, 141). Evidence for the domain 
structure comes from isolation of mutant proteins that 
have lost one or more of the methylating activities, while 
the remaining methylase functions are normal (52). In addition, 
chimeric MTases can be constructed from two different 
MTases by “domain swapping” the target-specifying 
regions of the genes (5). 


Thymidylate Synthetase 

Most of the group III phages also carry the thyP gene, 
which encodes thymidylate synthetase (TSase), near the 
center of the phage genome (110, 117). The thyP3 gene of φ3T 
has been cloned, and the gene complements Thy auxotrophy 
in both E. coli and B. subtilis (28). When the cloned 
thyP3 gene is transformed into B. subtilis 168, the DNA 
recombines at one of two chromosomal locations to yield 
Thy+ prototrophs (123). One site is in the thyA gene, 
which is not surprising given the high degree of sequence 
similarity (97%) between the thyP3 and thyA genes (64, 
125). B. subtilis, unlike any other bacteria, has a second 
TSase gene, thyB, mapping near the SPβ prophage and 
encoding a thermolabile TSase (90). However, the thyP3 
and thyB genes are less than 30% identical in their DNA 
sequences (60), and the thyP gene does not recombine in 
this region. 

The other site of thyP insertion is in the resident 
SPβ prophage. SPβ itself does not carry a thyP gene, but 
DNA sequences flanking the φ3T thyP are homologous to 
a region in the center of the SPβ genome (117). Hybrid SPβ 
phages can be constructed that carry the thyP gene, the so-called 
SPβT phages (119). It is not clear why group III phages 
harbor the thyP gene, but the significant sequence similarity 
between thyP and thyA indicates a possible evolutionary 
link between these two TSases. 

Lytic Proteins 

During the course of lytic growth, the host cell is lysed 
by phage-encoded enzymes to release the newly synthesized 
phage. Four such lytic protein genes have been identified 
in the PBSX late gene operon: xepA, xhlA, xhlB, and xlyA 
(66, 78). The xepA gene encodes an exoprotein whose precise 
function is not yet known. xlyA encodes a 32 kDa endolysin 
resembling the cwlA-encoded bacterial amidase in 
B. subtilis (44). Degradation of cell walls by the PBSX 
amidase produces N-2,3-dinitrophenyl-L-alanine, indicating 
that it is an N-acetylmuramoyl-L-alanine amidase. It is the 
major enzyme for lysing the cell wall following PBSX induction. 
xhlA and xhlB encode two polypeptides (89 and 87 
amino acids, respectively) that possibly interact to form a 
holin, which may play a role in exporting the amidase (see 
chapter 10 for review of holin and amidase functions). 

SPβ also produces an N-acetylmuramoyl-L-alanine amidase, 
which is 40 kDa in size and is the product of the blyA 
gene (102). blyA is part of an operon that also contains 
bhlA and bhlB, which encode holin-like polypeptides of 
70 and 88 amino acids, respectively. This is the same gene 
organization seen in PBSX and may be an example of 
horizontal gene exchange that might occur during phage 
evolution (55). 

A mutant of φ105, φ105MU331, is a prophage vector 
that cannot lyse host cells (76). The phage contains a lacZ 
reporter gene inserted into a putative holin gene. Without 
holin activity, the lytic enzyme is nonfunctional. 

Phages also encode other proteins that kill sensitive 
bacteria, such as betacin (54). The genes encoding betacin 
(bet) and tolerance to betacin (tol) are both present on 
the SPβ genome, possibly encoded by yolG and yolH, 
respectively (74). Although the betacin protein has not been 
purified, the putative protein deduced from the yolG sequence 
is about 6 kDa, and the putative tol gene product resembles 
an ABC-transporter protein. 

PBSX functions as a phage-like bacteriocin, infecting 
and killing phage-sensitive cells. A phage-encoded protein, 
perhaps a tail protein, causes disruption of the cell wall 
and leads to lysis (120). 

Other Phage Genes 

Phages carry a variety of genes involved in DNA replication 
or recombination. SP02 and IG1 encode their own 
DNA polymerases that, unlike the B. subtilis DNA polymerase 
III, are not inhibited by 6-(p-hydroxyphenylazo)-uracil 
(38, 109). SP02 also contains a DNA repair gene(s), 
allowing the phage to repair UV-induced damage even in a 
Uvr- host (45). 

As additional DNA sequence information becomes 
available, more phage-associated genes are identified. The 
SPβ yoqV gene encodes an ATP-dependent DNA ligase, 
which is common to other phages. However, the YoqV ligase 
will not complement the activity of the essential 
NAD-dependent ligase encoded by yerG, and is not essential to 
the cell because SPβ nonlysogens are quite viable (99). 
Another SPβ gene, yopP, is homologous to the bacterial codV, 
which encodes a recombinase that resolves chromosomal 
dimers at the dif site during cell division (112). In a 
Cod- SPβ lysogen, the phage recombinase reduces the 
frequency of defective chromosomal partitioning events. 

SPβ also encodes a ribonucleotide reductase (RR), 
which is responsible for the reduction of ribonucleotides to 
deoxyribonucleotides. The two subunits of the enzyme are 
produced by the phage bnrdE and bnrdF genes, which are 
highly homologous to their B. subtilis counterparts, nrdE 
and nrdF (75). The phage genes differ from the cellular 
homologs in one crucial respect: each of the phage genes 
contains a group I intron, and bnrdE also contains an intein 
(48, 75). Although examples of introns, and more recently 
inteins, are well known in bacteria, archaea and eukaryotes, 
the intron-intein coincidence has not been observed before 
(23). In addition, Lazarevic reports that sequences in 
the approximately 330 bp intergenic region between bnrdE 
and bnrdF are similar to a eukaryotic spliceosome and may 
represent the remnants of another intron (72). Figure 35-6 
depicts the introns, intein, and possible spliceosome-like 
structures. 

Within the bnrdF intron there is a 522 bp ORF, yosQ, 
resembling an intron homing endonuclease, which is 
involved in the spread of introns between genes. The bnrdE 
intron does not contain such an ORF. However, the bnrdE 
intein has all the characteristic features of a homing 
endonuclease (75). In other RR genes from SPβ-related phages 
in different Bacillus species, the same intron and 
intron-intein insertions occur, and a putative homing 
endonuclease also appears in the bnrdE-bnrdF intergenic region 
(73). A recent review article presents a detailed discussion 
of intron occurrence in bacterial and phage genes (27). 

Temperate Phage as Cloning Vectors 

Several cloning vectors have been derived from B. subtilis 
temperate phages, including φ105, SPβ, ρ11, and ρ14 
(30, 31, 62, 65, 68). These vectors generate stable, single-copy 
insertions of the cloned genes into the host chromosome. 
Two different methods have been developed for using 
phage vectors. The first is direct transfection of protoplasts 
by recombinant phage DNA, which has been quite successful 
with φ105 (see figure 35-7A). 

The vector shown in figure 35-7A, φ105J106, has unique 
BamHI, XbaI, and SalI restriction sites for cloning, and all 
essential phage genes remain intact (26). Following digestion 
with SalI, the 4-base overhang (5'-TCGA...) is partially 
filled in with bases T and C to prevent self-ligation of 
vector “arms.” Similarly, insert DNA partially digested with 
MboI has its 4-base overhangs (5'-GATC...) partially filled 
in with bases G and A. When vector and insert DNA are 
mixed, the remaining 2-base overhangs are compatible and 
the recombined molecules ligate together. Usefulness of 
this vector for “shotgun” cloning is limited because only 
about 5-6 kb of DNA can be inserted into the phage. 
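
The partial fill-in logic described for the SalI and MboI ends can be checked with a short sketch. The Python below is illustrative only and is not taken from the cited work: the function names (partial_fill_in, reverse_complement, can_anneal) are hypothetical, overhangs are assumed to be written 5' to 3', and the polymerase is modeled simply as extending the recessed 3' end one base at a time for as long as the required nucleotide is among those supplied. 

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def partial_fill_in(overhang: str, supplied: set) -> str:
    """Return the 5' overhang left after fill-in with only the supplied dNTPs.

    The recessed 3' end is extended across the overhang, starting at the
    template base nearest the duplex (the last base of the 5'->3' overhang).
    Extension stops at the first template base whose complement is missing.
    """
    remaining = overhang
    while remaining and COMPLEMENT[remaining[-1]] in supplied:
        remaining = remaining[:-1]
    return remaining


def can_anneal(overhang_a: str, overhang_b: str) -> bool:
    """Two 5' overhangs anneal when one is the reverse complement of the other."""
    return overhang_a == reverse_complement(overhang_b)


# SalI-cut vector arm filled in with T and C; MboI-cut insert filled in with G and A.
vector_end = partial_fill_in("TCGA", {"T", "C"})
insert_end = partial_fill_in("GATC", {"G", "A"})

print("vector overhang:", vector_end)
print("insert overhang:", insert_end)
print("vector-insert anneal:", can_anneal(vector_end, insert_end))
print("vector-vector anneal:", can_anneal(vector_end, vector_end))
print("insert-insert anneal:", can_anneal(insert_end, insert_end))
```

Under these assumptions the vector arms retain a 5'-TC overhang and the inserts a 5'-GA overhang. Each pairs with the other but not with itself, which is consistent with the statement that the fill-in prevents self-ligation of the vector arms while still allowing vector-insert ligation. 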

φ105 vectors have been constructed that promote efficient 
expression of specific recombinant genes (49, 127). 
These vectors contain a temperature-sensitive repressor 
protein, cts52, and have foreign gene expression under the 
control of a phage late gene promoter. One of these vectors, 
φ105MU331, has the site of foreign gene insertion in a 
putative holin gene (76). Following heat induction, the 
heterologous gene is highly expressed from the phage 
promoter and, because the host cells do not lyse, more of 
the protein can be produced. 

The second cloning method used is prophage transformation, 
whereby ligated phage and chromosomal DNA 
are introduced into competent lysogenic cells. The ligated 
DNA recombines into the prophage to generate specialized 
transducing phage (see figure 35-7B). This method 
has been used more successfully with ρ11 and SPβ than with 
φ105 vectors because the latter phage is induced during 
competence development (see above). The SPβ vector 
shown in figure 35-7B, SPβ c2 Δ2::Tn917, is heat inducible, 
has a 10 kb deletion (Δ2), and carries the 
erythromycin-resistance transposon, Tn917 (101). A plasmid 
vector, pCV1, carries the ends of Tn917 and SPβ DNA flanking 
those ends on a pBR322-derived plasmid vector that replicates 
only in E. coli. The plasmid is cleaved into two fragments 
by SstI and BamHI, heterologous DNA is inserted into 
the BamHI site, and long concatemers of vector-insert-vector 
DNAs are generated during ligation. The concatemers 
are transformed into the SPβ c2 Δ2::Tn917 lysogen and, 
by homologous recombination, the cloned DNA is integrated 
into the prophage to generate a specialized transducing 
phage. The recombinant phages are identified by 
selection for chloramphenicol resistance, and these 
drug-resistant clones are pooled and subjected to heat induction. 
To isolate clones carrying specific genes, the recombinant 
phage lysates are used to convert an auxotrophic nonlysogen 
to prototrophy. 

Figure 35-6 Introns, intein, and possible spliceosome in the bnrd region of SPβ. A: bnrdE intron, B: bnrdF intron, C: BnrdE 
intein, and D: possible intergenic spliceosomal element. Numbers indicate nucleotide position in SPβ c2 prophage sequence 

Figure 35-7 Cloning vectors constructed from phages. 
A: Direct transfection with phage φ105J106 and B: prophage 
transformation with phage SPβ c2 Δ2::Tn917. Dark-gray bars 
represent insert DNA; light-gray bars represent SPβ DNA; B is 
BamHI; X is XbaI; S is SalI; Ss is SstI. 

Conclusions 

Despite the diverse nature and unique features of the 
B. subtilis temperate phages, they have similarities to 
temperate phages isolated from other bacteria, for example in 
their genetic control. Some of their functions resemble 
those of B. subtilis lytic phage, such as their cell-lysis 
mechanisms (98). It is becoming more evident that these 
conserved mechanisms are no accident, but have resulted 
from a high degree of horizontal gene transfer between the 
phage genomes (56). 

Lactococci are Gram-positive mesophilic bacteria with 
low G-C content that belong to the group of lactic 
acid bacteria. They are aerotolerant and live by means of 
fermentation, as they lack a respiratory chain, and 
the main end product of fermentation is lactic acid. 
In nature lactococci are mainly found on plant material 
(100). Some lactococci have apparently adapted to multiply 
in milk and contain a plasmid-encoded milk protease and 
enzymes responsible for catabolism of lactose. Lactococcus 
lactis is used world-wide as starter culture for large-scale 
milk fermentations producing cheese such as cheddar and 
other fermented milk products. It has been estimated that 
approximately 10⁷ tonnes of cheese are made annually, leading 
to human consumption of close to 10¹⁸ lactococci (41). 
In the genus Lactococcus five species are currently found: 
L. lactis, L. garvieae, L. plantarum, L. piscium, and L. raffinolactis 
(101, 118). Of these only L. lactis is used as starter culture 
and only phages from L. lactis have been studied at the 
molecular level. L. lactis is divided into three subspecies. 
Of these only L. lactis ssp. lactis and L. lactis ssp. cremoris are 
used for milk fermentations. They are distinguished by 
differences in their DNA sequences, including those encoding 
16S rRNA (45). 

The phylogenetic trees based on 16S rRNA identify 
streptococci as the closest relatives to L. lactis. In contrast 
to the many streptococcal species, lactococci are nonpathogenic, 
food-grade bacteria and may even be beneficial to 
health. Their potential for new applications such as oral 
vaccines is being investigated (106). In the last decade an 
impressive amount of research has been conducted on 
L. lactis and it is now considered a model organism. As 
laboratory model strains, IL1403 (L. lactis ssp. lactis) and MG1363 
(L. lactis ssp. cremoris) are widely used. The 2.4 Mbp genome 
of IL1403 has been sequenced (12) and genome sequencing 
of MG1363 is in progress (62). Genetic techniques and 
tools such as transduction, conjugation, transformation, 
and transposon mutagenesis are available (39). Natural 
competence has never been described for lactococci despite 
all the necessary competence genes having been found 
on the IL1403 chromosome (12). 

The industrial use of L. lactis in vast amounts for milk 
fermentations provides a gigantic environment 
for phage reproduction and evolution. Phage infections 
are difficult to avoid, since pasteurized milk is not 
sterile and may contain phages that can destroy 
the fermentation and hence the product. Furthermore, the 
use of defined strains as starter cultures has greatly limited 
the number of industrial strains used and hence made it 
easier for phage contaminants to proliferate. After the dairy 
production failures in the mid-1930s were recognized to 
be caused by phage infections (117), research into phages 
and phage resistance has been a major issue in the lactococcal 
field. Phage resistance systems, however, are outside 
the scope of this chapter; for reviews see (2, 40). 

In 1949 it was discovered that L. lactis could be lysogenic 
(98). Later, lysogeny was found to be a very common 
phenomenon in L. lactis strains (47, 51, 53). In accordance 
with this, several prophages were discovered when the 
genome of the model strain L. lactis ssp. lactis IL1403 was 
sequenced (30). The majority of phages infecting L. lactis 
belong to the Siphoviridae family and carry a long, 
noncontractile tail, while a few belong to the Podoviridae family 
and have a very short tail. Among the lactococcal phages 
only morphotypes B1, B2, C2, and C3 have been observed 
(table 36-1; see also chapter 2 for a general discussion of 
phage classification). At present only phages containing 
double-stranded DNA genomes have been identified. On the 
basis of morphology and DNA hybridizations (55) lactococcal 
phages were divided into 12 phage species. Among these, 
phage species 1483, 1358, and φT187 were later assigned 
to the P335 phage species (30, 54, 83). We have furthermore 
included phage BK5-T in the P335 species (30). The 
remaining eight species, their type phage, and some examples 
of specific phage members are shown in table 36-1. 
Over the years many lactococcal phages have been isolated 
world-wide and more than 80% of all lactococcal phages 
isolated from the dairy environment belong to phage 
species 936, P335, and c2 (52, 60, 97). Due to their industrial 
importance, lactococcal phage research has been 
focused on these three phage species, and 16 of the 
17 fully sequenced genomes belong to these species 
(table 36-2). In this chapter we will therefore concentrate 
on phages belonging to the c2, 936, and P335 phage species 
with the main emphasis on the sequenced members of 
these phages. 

Table 36-1 Taxonomy of Lactococcal Phages as Modified from (55) 

Family        Morphotype(a)  Phage species  Type phage  Members 
Siphoviridae  B1             936            P008        P008, F4-1, sk1, φUS3, bIL41, bIL66, bIL170, uc1001, uc1002, 712 
              B1             P335(b)        P335        φ31, φ50, ul36, r1t, φLC3, TP901-1, Tuc2009, C3-T1, Q30, Q33, 7-9, 1483, 1358, φT187, bIL285, bIL286, bIL309, TPW22, 4268, BK5-T(c) 
              B1             P107           P107        - 
              B1             P087           P087        - 
              B1             949            949         - 
              B2             c2             c6A         c2, c6A, bIL67, bIL188, φvML3, P001, P6, φ197 
Podoviridae   C2             P034           P034        P034, asccφ28 (I. B. Powell, personal communication) 
              C3             KSY1           KSY1        - 

(a) For a description of morphotypes see chapter 2. 
(b) For further subdivision of the P335 species see table 36-5. 
(c) BK5-T was originally proposed to be a separate phage species. 

Table 36-2 Lactococcal Phages with Fully Sequenced Genomes 

Phage species  Phage     Type       Ends  Genome size (bp)  Reference                             Accession no. 
c2             bIL67     Virulent   cos   22,195            (102)                                 L33769 
               c2        Virulent   cos   22,172 or 22,163  (71)                                  L48605 
936            sk1       Virulent   cos   28,451            (27)                                  AF011378 
               bIL170    Virulent   cos   31,754            (34)                                  AF009630 
P335           r1t       Temperate  cos   35,550            (111)                                 U38906 
               TP901-1   Temperate  pac   37,667            (23)                                  AF304433 
               Tuc2009   Temperate  pac   38,347            (36)                                  AF109874 
               bIL285    Temperate  -     35,538            (30)                                  AF323668 
               bIL286    Temperate  -     41,834            (30)                                  AF323669 
               bIL309    Temperate  -     36,949            (30)                                  AF323670 
               BK5-T     Temperate  cos   40,003            (35, 78)                              AF176025 
               ul36      Virulent   -     36,798            (65)                                  AF349457 
               4268      Virulent   -     36,596            -                                     AF489521 
Unknown        bIL310    Temperate  -     14,957            (30)                                  AF323671 
               bIL311    Temperate  -     14,510            (30)                                  AF323672 
               bIL312    Temperate  -     15,179            (30)                                  AF323673 
P034           asccφ28   Virulent   -     18,762            I. B. Powell, personal communication  - 

Phage Species c2 

General Description 

This group of phages consists of prolate-headed phages 
with long, noncontractile tails. So far only virulent cos 
phages with genome sizes from 18 to 22 kb have been 
assigned to this phage species. Phage bIL67 was the first 
lactococcal phage to be sequenced (102) and it was later 
found that bIL67 and phage c2 are highly related, since the 
two phage genomes share 80% nucleotide sequence identity 
(71). Genome sequencing of the c2 phage identified 39 
open reading frames (ORFs) organized in two clusters of 
divergent orientations (figure 36-1) (71). 
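
A figure such as "80% nucleotide sequence identity" is normally derived from a pairwise alignment. The Python sketch below shows one common way to compute it; it is an illustration only. The function name percent_identity is hypothetical, the two short sequences are invented toy data (not c2 or bIL67 sequence), and excluding gapped columns from the denominator is an assumed convention, since the source does not state how the published value was calculated. 

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned, ungapped columns of two sequences.

    Both inputs must already be aligned (equal length, '-' for gaps).
    Columns where either sequence has a gap are skipped (assumed convention).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for base_a, base_b in zip(aligned_a.upper(), aligned_b.upper()):
        if base_a == "-" or base_b == "-":
            continue  # skip gapped columns
        compared += 1
        if base_a == base_b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Invented toy alignment: 10 columns, 2 mismatches.
toy_a = "ACGTACGTAC"
toy_b = "ACGTTCGTAA"
print(percent_identity(toy_a, toy_b))
```

Other conventions (for example, counting gaps as mismatches or dividing by the length of the shorter genome) give different numbers for the same alignment, so identity values from different studies are only comparable when the convention is known. 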

Transcription 

Transcriptional analysis of the phage c2 genome during 
one-step growth showed two classes of genes: early and 
late (figure 36-1). Early mRNAs were found 5 and 10 minutes 
after infection as well as later in infection, while late mRNAs 
were found from 15 minutes onwards until cell lysis at 
45 minutes (5). This temporal gene expression corresponds 
to the two gene clusters found by sequence analysis (71). 
Six early promoters and one late promoter were also suggested 
by sequence analysis, and the mRNA ends were verified by 
primer extension analysis according to unpublished results 
from the Jarvis group. 

Figure 36-1 Genome organization of virulent phages c2 and sk1. The DNA sequence of c2 is obtained from (71), accession 
number L48605, while the DNA sequence of phage sk1 is obtained from (27), accession number AF011378. Arrows indicate 
the size and direction of transcription of open reading frames (ORFs). Assigned functions are written above the ORFs. 
Identified and putative promoters are indicated as small black arrows, while origins of replication are shown as a black box 
below the ORFs. The locations of cos sites are indicated. Regions of the genome transcribed early, middle, or late in the 
infection cycle are marked with black lines below the genome. 

Gene Functions and Origin of Replication 

Amino acid similarity suggests that phage c2 encodes 
a DNA polymerase by three successive ORFs (e5 to e7), 
whereas regulatory proteins, e12 and e22, were proposed 
due to the presence of a DNA-binding or a sigma-like motif, 
respectively. Finally, the e15 protein shows similarity to 
the recombination protein, Erf, of the Salmonella phage 
P22 (figure 36-1) (the biology of phage P22 is reviewed 
in chapter 29). Generally the ORFs identified in the early 
region were small in size and the possibility for creation of 
larger reading frames through the mechanisms of frame 
shifting has been discussed (71). 

Several of the structural genes in the late gene cluster 
were identified by a combination of 
SDS-PAGE, N-terminal sequencing, western blot, and 
immunogold electron microscopy of c2 phage particles (71). Gene 
products L5, L7, and L10 were thus identified as the 
major head, major tail, and putative tail adsorption protein, 
respectively. Analysis of several major virion protein 
bands with identical N-termini showed that the major 
head protein of c2 most likely is both processed 
proteolytically and covalently linked in protein complexes with 
molecular weights corresponding mainly to trimeric and 
hexameric forms of the processed major head subunit (71). 
Furthermore, a proteolytic cleavage site of the major head 
protein was predicted by computer analysis (35). A minor 
processed form of the major tail protein (L7) was also 
observed. Terminase function was assigned to L12 on the 
basis of domain homology (71). L3 had previously been 
identified as the phage lysin (115). A possible holin function 
was assigned to L17 on the basis of possession of 
two predicted transmembrane domains and a charged 
C-terminus (see chapter 10 for a review of holin structure 
and function), while L2, located immediately upstream 
of the lysin gene, was suggested to encode a 
structure-dependent endonuclease activity necessary for maturation 
and packaging of phage DNA, as shown for phage bIL66 (7). 
A 521 bp fragment of the noncoding region located between 
the divergently transcribed gene clusters was shown to 
act as an origin of DNA replication in L. lactis when cloned 
in an origin screening vector, suggesting that this region 
harbors the origin of phage c2 replication (116). Investigation 
of replicating intermediates of c2 confirmed 
this and furthermore revealed a theta replication 
mechanism (24). 

Relationship Between c2 and bIL67 

Analysis of the DNA sequences of c2 and bIL67 showed 
that the sequences are almost identical (35, 71). Only 
three regions showed nonalignment: one corresponds 
to the origin of replication, another was located in a central 
domain in the putative tail adsorption gene, and the 
third covered a region encoding minor structural proteins 
(L14, L15, and L16 in c2) that, despite their 
nucleotide-sequence divergence, retained significant amino acid 
similarity to the corresponding proteins from phage bIL67. 
The L15 and L16 proteins were found to share significant 
amino acid similarity with putative tail adsorption 
proteins from both cos- and pac-site Streptococcus 
thermophilus phages. Experimental evidence has now been 
provided showing that the host range of both bIL67 and 
another prolate phage is determined by protein 35, 
corresponding to the L15 protein of c2 (108) (figure 36-1). 
L16 has been identified as part of the collar structure of 
a c2 derivative possessing a collar. The collar-containing 
c2 phage harbors an additional gene (col) that is downstream 
of L16, and the presence of both col and L16 seems 
to be necessary for collar formation. Collarless c2 phages 
arise spontaneously from the collared form, presumably 
by homologous recombination between the L16 and col 
genes (I. B. Powell, personal communication). 

Phage Receptors 

The only lactococcal phage receptor characterized at the 
molecular level is the membrane protein Pip (phage infection 
protein), which is required for infection by prolate-headed 
phages of the c2 species (44). It was suggested 
that phage adsorption is a two-stage process involving a 
reversible adsorption to carbohydrate components in the 
cell wall followed by an irreversible interaction with Pip 
(84). Lactococcal host cells containing a deletion of the pip 
gene were found to be resistant to phages of the c2 species. 
Furthermore, no growth defects were observed (42). In 
contrast, Pip was found not to be required for infection by 
phages of the 936, P335, and 949 species (63). The corresponding 
phage protein (the antireceptor) involved in interaction 
with the Pip protein was identified for two of the 
prolate-headed c2 phages as mentioned above (108). 


936 Phage Species 

General Description 

The 936 phage species contains small isometric-headed 
phages with short tails. Genome sizes range from 25 to 
40 kb and at present only virulent cos phages have 
been identified. So far only one type of cos site has been 
identified (11 bp 3' overhang CACAAAGGACT) (28, 91, 
95, 99). The major parts of sequenced regions of 936 phages 
have been found to be highly similar and two members of the 
936 species have been fully sequenced: phages sk1 (27) and 
bIL170 (34). As described in (35), their genomes could be 
aligned essentially over their entire lengths. The exceptions 
were mainly insertions in either one or the other phage. 
Since more functional studies have been performed on sk1, 
this phage will be used as model phage for the 936 phages. 
The genome of sk1 is organized in three gene clusters 
(figure 36-1). The two divergently located gene clusters 
mainly contain small genes of unknown function, while 
the remaining 20 genes encode proteins involved in DNA 
packaging, morphogenesis, and cell lysis (27). 
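
The cohesive ends of a cos phage anneal because the single-stranded overhang at one end of the linear genome is the reverse complement of the overhang at the other end. As a minimal sketch, the Python below derives the partner overhang for the 936-species cos sequence quoted above. It assumes the sequence is written 5' to 3' (the source does not state the orientation), and the names COS_OVERHANG and reverse_complement are illustrative. 

```python
# Translation table mapping each base to its Watson-Crick complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


# cos overhang of the 936 species as quoted in the text (assumed 5'->3').
COS_OVERHANG = "CACAAAGGACT"

partner = reverse_complement(COS_OVERHANG)
print("overhang length:", len(COS_OVERHANG))
print("overhang:        5'-" + COS_OVERHANG + "-3'")
print("partner strand:  5'-" + partner + "-3'")
```

Because the partner is fully determined by the quoted sequence, the same check can be applied to any other cos overhang reported for a phage species. 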

Transcription 

Transcriptional analysis of sk1 divided the gene expression 
during the lytic cycle into early, middle and late 
transcripts; the early transcripts were repressed in the later 
phases of the lytic cycle (26). The early region covers 
the large gene cluster consisting of 30 small ORFs, the 
middle region covers four small ORFs, and the late region 
consists of the 20 ORFs located downstream of the cos site 
(figure 36-1) (27). 

In the early region seven partially overlapping transcripts 
were found, and the 5' ends of three of these (E1, E5, 
and E7) were mapped by primer extensions. Promoter activity 
of PE1 and PE7 was furthermore verified by promoter 
cloning. In addition, site-directed mutagenesis was shown 
to destroy promoter function of PE1 and PE7 (27). Sequences 
showing some similarity to lactococcal consensus promoters 
could be found in four additional intergenic regions 
in the early gene cluster. Promoter activity, however, was not 
demonstrated. 

The cloned early promoters were active in the absence 
of phage proteins; in contrast a middle promoter (PM) was 
identified, which was only functional in the presence of 
phage proteins. The DNA region necessary for PM activity 
was further defined by deletion analysis and mutagenesis 
studies. The deletion analysis demonstrated that the region 
located at -46 to -55 upstream of the transcriptional 
start site was necessary for promoter activity, and mutagenesis 
studies expanded the region necessary for promoter 
activity to cover from -36 to -55 (27). An identical PM 
promoter had earlier been identified in phage bIL66 (6). 

Northern analysis during sk1 infection showed the 
presence of seven early mRNAs, nine small middle mRNAs, 
and four large late mRNAs. The largest of the late transcripts, 
estimated in the range of 14 kb, may cover all the late 
genes, while the remaining late mRNA species are overlapping 
and probably are processed forms of the large primary 
transcript. A large 16 kb transcript from the late promoter 
in bIL41 has been demonstrated to cover all the late genes. 
The partially overlapping smaller late transcripts were 
suggested to be products from partial termination or from 
processing of the larger transcript. Possible RNase E sites 
were found at the end of some of these mRNAs (91). 

Gene Functions and Origin of Replication 

In the early region of the sk1 genome many of the small 
ORFs are overlapping and the opportunity for larger ORFs 
by programmed frame shifting of translation was suggested 
for some of these (27), as has been observed for other 
phages, such as λ (67). Phage sk1 ORF43 and ORF44 
proteins showed similarity to a protein found in the 
prolate-headed lactococcal phages: the proposed DNA polymerase 
subunit encoded by e5 from phage c2 and orf3 
from phage bIL67. Although the genes flanking the c2 
and bIL67 homologs of sk1 gene 43/44 are also annotated 
as DNA polymerase subunit genes, no matches to these 
flanking DNAP genes of c2 and bIL67 are found in sk1. 

Sequence analysis of the encoded proteins in sk1 has 
assigned the small terminase subunit, the large terminase 
subunit, portal protein, major head protein, major tail 
protein, tape measure protein, holin, and lysin to ORFs 1, 
2, 4, 6, 11, 14, 19, and 20, respectively (figure 36-1) (19, 27). 
It was noted that the gene order in the structural genes of 
sk1 and λ was highly conserved except for the inverted 
order of ORF9 and ORF10, as well as for ORF17 and ORF18. 
In addition, no homologs of λ genes D, I, and Rz could be 
found (27). See chapter 27 for a review of these phage λ 
genes and gene order. 

The function of the holin and lysin from sk1 was 
shown by the damage caused by induction of these ORFs 
in Escherichia coli (27). The major tail protein homolog had 
previously been demonstrated to be an abundant structural 
protein of the L. lactis ssp. cremoris phage F4-1 (33). 
In contrast to the c2 phage species, no evidence has been 
found for proteolysis of the major head protein in sk1 and 
other 936 phages, either experimentally or from analysis 
of the amino acid sequence (35). 

The origin of replication was localized in the early 
region to an 800 bp fragment of sk1, which enabled plasmid 
replication in L. lactis (figure 36-1). This minimal origin of 
replication covered the intergenic region between orf47 and 
orf48, including the PE1 promoter and the N-terminal 179 
amino acids of protein ORF47, which has a total length of 
231 amino acids. Three pairs of direct repeats and a pair of 
67 bp repeated sequences were identified in the intergenic 
region (27). It has not yet been determined whether 
the protein fragment of ORF47 or DNA sequences 
within orf47 are required for replication. 

Interaction with Abortive Infection Mechanisms 

A region of phage bIL66, showing high similarity to sk1 
(85% identity), was shown to be a target for the AbiD1 
abortive infection locus. The region is expressed from the 
PM promoter and contains ORF1 through ORF4, corresponding 
to ORF51 through ORF54, respectively, of sk1 
(figure 36-1). Mutations in bIL66 that overcome the inhibition 
by AbiD1 were shown to be located in orf1. ORF2 and ORF3 
of bIL66 have been shown to cause double-stranded DNA 
breaks in branched DNA structures. ORF3 furthermore 
has amino acid similarity to the E. coli RuvC Holliday junction 
resolvase. It was therefore suggested that ORF2 and 
ORF3 form a structure-dependent endonuclease necessary 
for maturation and packaging of the phage DNA (7). The 
current hypothesis is that AbiD1 acts in conjunction 
with ORF1 to reduce the amount of ORF3 below that 
required for normal phage development (7). It was proposed 
that translation was reduced, since transcription was not 
affected (6). The RuvC homolog was further identified in 
the prolate phages c2 (L2) and bIL67 (ORF23). Interestingly, 
AbiD1 also inhibits proliferation of the prolate-headed 
phages (3). 

A transcriptional study during the lytic cycle of bIL170 
confirmed the findings from sk1 of early, middle, and late 
expressed regions (90). The sensitivity of this phage to the 
abiB abortive infection locus was furthermore examined. 
Transcriptional analysis showed that when AbiB-containing 
host cells were infected with bIL170, an RNase 
was produced, destroying phage mRNA. The results suggest 
that an early-expressed phage protein in combination 
with AbiB forms or activates this RNase (90). It may be 
noted that phage bIL41, in contrast to phage bIL170, is 
resistant to the abiB abortive infection locus. 

A difference in sensitivity to an Abi system has 
also been found for the 936 phages sk1 and 712. A 324 bp 
region covering the cos site from the resistant sk1 phage, 
cloned in a multicopy plasmid, was shown to protect 
the sensitive 712 phage from the abiF abortive infection 
locus (99). 

P335 Phage Species 

General Description 

Phages of the P335 phage species are small isometric-headed 
phages with genomes ranging from approximately 
30 to 42 kb. In this group of phages cos or pac sites 
are used during initiation of DNA packaging, and it is the 
only lactococcal phage species that contains both virulent 
and temperate phages. Nine genomes, including BK5-T, of 
both temperate and virulent members are completely 
sequenced, including three full-size prophages located 
in the L. lactis ssp. lactis IL1403 genome (table 36-2) (30). In 
addition, three satellite prophages from IL1403 (table 36-2) 
and parts of many other phage genomes are available from 
the databases. The genetic organization of the P335 phage 
genomes is highly similar (18, 23, 35, 65, 112). The temperate 
phages all have a small lysogenic operon and a large, 
divergently located cluster of genes involved in lytic growth, 
whereas the virulent phages have similar organizations, 
only lacking the lysogenic gene cluster or parts of it 
(figure 36-2). Due to the similarity in genetic organization 
and the availability of many biological data from several 
different phages, the knowledge of P335 phages will be 
presented as functional modules rather than as isolated 
overviews of individual phages. 

Figure 36-2 Genome organization of temperate phage TP901-1. Sequence is obtained from (23), accession number 
AF304433. Arrows indicate the size and direction of transcription of open reading frames. Assigned functions are written 
above the open reading frames. Identified promoters are indicated as small black arrows, while the origin of replication 
is shown as a black box below the open reading frames. Regions of the genome transcribed early, middle, or late in the 
infection cycle are marked with black lines below the genome. 

Transcriptional Patterns 

Transcription of the temperate phage TP901-1 genome 
(figure 36-2) was analyzed by northern blot during the lytic 
cycle. It was found that sequential clusters of the genome 
were temporally transcribed and that the genome can 
be divided into early, middle, and late expressed regions 
(75). Eight early transcripts are present at maximum level 
10 minutes after infection. At least seven of these originate 
from the two identified early promoters. The four 
middle transcripts, observed at maximum level 30 minutes 
after infection, are located in the region encoding orf24 
to orf29, which is also transcribed in the early phase of 
the lytic cycle. The remaining genes (orf30 to orf56) are 
transcribed late in the lytic cycle (23, 75). 

Site-Specific Recombination 

Most phage integrases belong to the lambda family of 
integrases; only a few integrases are members of a new 
family showing homology to the catalytic site of resolvases 
and invertases (31, 64, 79, 109). TP901-1 encodes a 
resolvase-type integrase (group II in table 36-3), while the 
remainder of the investigated lactococcal phages contain 
an integrase belonging to the λ family (groups I and III in 
table 36-3). The TP901-1-encoded resolvase-type integrase 
catalyzes site-specific recombination in L. lactis (31), in 
E. coli (17), and in human cells (107). 

Site-specific integration in L. lactis occurred into 
a single chromosomal site, attB (table 36-4), in both the 
host and the non-host strain MG1363 (31, 32). The frequency 
of lysogenization was determined to be 2-5% of the infected 
cells, using an erm-labeled phage (61). Furthermore, a 
TP901-1 mutant containing a deletion of part of the integrase 
gene and orf2 showed the expected, dramatic reduction 
in lysogenization (61). 

Integration vectors that are useful for study of promoter 
fusions in a single copy on the chromosome have 
been constructed based on nonreplicating plasmids containing 
attP from TP901-1 and a selectable marker. These 
integration vectors integrate very efficiently, with a frequency 
similar to the frequency of plasmid transformation 
(21). The smallest regions of attP and attB that are sufficient 
for site-specific recombination were narrowed down to 56 and 
43 bp, respectively (17, 21), and the integrase binds with 
equal affinity to the two linear DNA fragments in vitro 
(17). A 5 bp sequence was identified as the core sequence 
(32) (table 36-4) and mutant analysis verified that 
the TC dinucleotide constitutes the overlap region (17). 


















Table 36-3 Similarity of Selected Functions Within the P335 Phages

Function Group I Group II Group III Group IV Group V

Integrase

r1t (ORF1)

bIL285 (ORF1: 98%)

ΦLC3 (INT: 97%)

BK5-T (ORF33: 97%)
Tuc2009 (ORF1: 97%)
bIL309 (ORF1: 40%)
TPW22 (INT: 27%)

TP901-1 (ORF1)

bIL286 (ORF1)

bIL309 (ORF1: 25%)
ul36 (ORF359: 99%)

Superinfection exclusion

r1t (ORF2)

Tuc2009 (Sie2009: 100%)
ΦLC3 (ORF173: 100%)

TP901-1 (ORF2)

bIL285 (ORF2: 98%)
bIL309 (ORF2: 66% C)

BK5-T (ORF34)

Repressor

r1t (RRO)

bIL309 (ORF3: 97%)

ΦLC3 (ORF286: 79%)
Tuc2009 (ORF4: 78%)
BK5-T (ORF35: 79%)

TP901-1 (ORF4)

bIL285 (ORF4: 98%)

Φ31 (ORF180: 68% N)

4268 (ORF1: 67% N)
ul36 (ORF188: 53%)
bIL285 (ORF4: 32%)

Cro-like

r1t (TEC)

bIL309 (ORF7: 100%)

TP901-1 (MOR)

bIL285 (ORF5: 98%)

Φ31 (ORF74: 50%)
ul36 (ORF74A: 50%)

4268 (ORF2: 50%)

BK5-T (ORF37)

Tuc2009 (ORF5)

ΦLC3 (ORF76: 100%)

Antirepressor

r1t (ORF5)

BK5-T (ORF38: 96%)
bIL309 (ORF8: 40% N)

TP901-1 (ORF6)

bIL285 (ORF6: 100%)
bIL286 (ORF6: 65% C)

ΦLC3 (ORF236: 95% C)
Tuc2009 (ORF6: 94% C)
Φ31 (ORF238: 94% C)
Φ31.1 (ORF238: 94% C)
ul36 (ORF238: 94% C)

4268 (ORF3: 94% C)

Excisionase

r1t (ORF6)

ΦLC3 (ORF110: 99%)
Tuc2009 (ORF7: 97%)
bIL285 (ORF7: 97%)

BK5-T (ORF39: 97%)
ul36 (ORFIIIa: 97%)
ul36.1 (ORFIIIb: 97%)
4268 (ORF4: 95%)
bIL309 (ORF9: 75%)

TP901-1 (ORF7)

Replication protein

Tuc2009 (ORF16)

bIL285 (ORF16: 99%)
ul36 (ORF255: 99%)
r1t (ORF11: 30% N)
bIL309 (ORF14: 54% C)
4268 (ORF11: 53% C)
bIL286 (ORF16: 24% C)

TP901-1 (ORF13)

BK5-T (ORF49: 27% C)
ul36.1 (ORF235: 33% C)
Φ31.1 (ORF269: 33% C)

Φ31 (ORF492)

Holliday junction resolvase

r1t (ORF14)

bIL285 (ORF18: 99%)
ul36 (ORF129: 98%)
Tuc2009 (ORF19: 97%)
BK5-T (ORF51: 97%)

TP901-1 (ORF15)

Φ31.1 (ORF139: 97%)
ul36.1 (ORF109: 76%)
bIL309 (ORF17: 32%)

Large terminase subunit

r1t (ORF27)

Φ31 (ORF5: 97%)

ΦLC3 (ORF180’: 97% N)

TP901-1 (ORF31)

ul36 (ORF462: 97%)
Tuc2009 (ORF31: 66% N)
Tuc2009 (ORF32: 97% C)

BK5-T (ORF2)

bIL286 (ORF41: 99%)
4268 (ORF31: 72%)
bIL309 (ORF39: 34%)
bIL285 (ORF41: 23%)



Function Group I Group II Group III Group IV Group V

Major head protein r1t (ORF31) TP901-1 (ORF36) BK5-T (ORF7) bIL309 (ORF43)

Tuc2009 (ORF37-39: 98%) 4268 (ORF35: 97%) bIL285 (ORF44: 26%)

bIL286 (ORF45: 96%)
bIL285 (ORF44: 23%)

Lysin r1t (ORF49) TP901-1 (ORF53) BK5-T (ORF27) 4268 (ORF49)

bIL285 (ORF62: 98%)
ul36 (ORF429: 96%) bIL286 (ORF61: 97%)

ΦLC3 (LYSB: 95%) bIL309 (ORF56: 97%)

TPW22 (ORFA: 94%) Φ31 (LYS: 98% N)

Tuc2009 (ORF57: 93%)

For each group the identity of proteins encoding the same function is indicated as a percentage of the type protein encoded by the phage marked in bold. C,
similarity only in the C-terminal; N, similarity only in the N-terminal. Proteins showing high similarity are placed in the same group; however, some similarity
may be found between groups, for example repressor groups I and II. In addition, some proteins may belong to the same protein family without being in the
same group, for example integrase groups I and III that contain λ-like integrases.
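
As an illustration of how percentages such as those in table 36-3 can be obtained, the following minimal sketch computes percent identity between two already-aligned protein sequences. It is not the procedure used by the authors: the function name, the choice of denominator (all aligned columns except those where both sequences carry a gap), and the example sequences are assumptions made for illustration only.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two pre-aligned sequences ('-' marks a gap).

    Assumption: columns where both sequences have a gap are ignored, and a
    residue opposite a gap counts as a mismatch. Other conventions exist.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" and b == "-":
            continue  # column carries no information
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


# Hypothetical aligned fragments, not taken from any phage protein.
print(percent_identity("MKTAYIAK", "MKTAYLAK"))
print(percent_identity("MK-TA", "MKSTA"))
```

Restricting the comparison to one end of the alignment would correspond to the "N" and "C" annotations in the table, where similarity is confined to the N-terminal or C-terminal part of the protein.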


Table 36-4 Attachment Sites of Temperate P335 Phages

Phage (a) | Size of core sequence and sequence 5' to 3' (if smaller than 15 bp) | Location of attB (b) | Reference to attB
ΦLC3, Tuc2009, r1t | 9 bp TTCTTCATG | Noncoding (IL1403) (c) | (30)
BK5-T, bIL285 | | C-terminal of ORF (MG1363) (c) | (68)
bIL286 | 80 bp | Noncoding (IL1403) | (30)
bIL309 | 29 bp | tRNA(Arg), but intact gene preserved in lysogen (IL1403) | (30)
TPW22 | 14 bp TAAGGCGACGGTCG | Orf, disrupted in lysogen (ligase-like gene) (MG1363, 3107 and W22) | (94)
TP901-1 (d) | 5 bp TCAAT | Orf125, disrupted in lysogen (competence-like: comGC in B. subtilis) (MG1363 and 3107) | (17)

a All except TP901-1 contain λ-like integrases.

b Strains used for attB determination are indicated in parentheses.

c The core sequences for ΦLC3 and bIL285 were found to be located differently in the strains MG1363 (changing the last 5 amino acids in an ORF) and IL1403
(located after the homologous ORF, which was found to be shorter than in MG1363). Sequence information from MG1363 was obtained from M. Lunde.
d The overlap region was determined to be TC (17).


Thus, mechanistically the integrase from TP901-1 so
far behaves as other members of the resolvase family
(105, 109).

Amongst the λ-type integrases almost identical proteins
were found in phages ΦLC3, Tuc2009, r1t, bIL285,
and BK5-T (table 36-3). Accordingly, identical attP and attB
core regions of 9 bp have been determined for these
phages (table 36-4). Site-specific integration catalyzed by
the ΦLC3- and the Tuc2009-encoded integrases has
furthermore been demonstrated in the non-host strain,
MG1363 (68, 110). An isolated ΦLC3 int mutant could
not form stable lysogens. Integrase activity, however, could
be complemented in trans by a clear-plaque mutant of ΦLC3
(68). An integration vector based on the attP site and int from
phage ΦLC3 has been constructed (70). The integration
was highly efficient using a temperature-sensitive origin of
replication. However, the number of integrants obtained
using a nonreplicating plasmid suggests that the integration
efficiency is in the order of 0.2%. A similarly low
frequency of integration was obtained for the identical
integration system of phage Tuc2009 (110). It is unsolved
whether both systems contain insufficient promoters for
integrase expression or whether the observed numbers
instead reflect rather inefficient integration systems. The
frequency of lysogenization by ΦLC3 has been estimated to
be 0.01% of the cells by measuring the number of attR
sequences by quantitative polymerase chain reaction (73).
Furthermore, ΦLC3 prophage stability varied with temperature
and growth phase of the host (74). However, the low
frequency of lysogenization and the impact of the growth
conditions of the host on prophage stability may also reflect
properties of the genetic switch.
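
The arithmetic behind a lysogenization frequency estimated from attR copy numbers can be sketched as follows. This is only a worked illustration under stated assumptions: each lysogen is taken to carry exactly one attR junction, the total number of cells is approximated by the copy number of a single-copy chromosomal reference locus, and the copy numbers in the example are invented rather than taken from reference 73.

```python
def lysogenization_frequency(attr_copies: float, reference_copies: float) -> float:
    """Percentage of cells carrying a prophage junction.

    Assumptions (hypothetical, for illustration): one attR per lysogen and
    one reference locus per chromosome, both quantified in the same sample.
    """
    if reference_copies <= 0:
        raise ValueError("reference copy number must be positive")
    return 100.0 * attr_copies / reference_copies


# Invented copy numbers: 50 attR copies among 500,000 chromosome copies.
print(f"{lysogenization_frequency(50, 500_000):.2f}% of cells")
```

With these invented numbers the result is 0.01%, the same order as the estimate quoted for ΦLC3.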


Figure 36-3 Genetic organization of the lysogeny module of
temperate P335 phages. Identified and putative functions 
are indicated at the top of the figure and genes are aligned 
according to this. Integrase, int; superimmunity exclusion, 
sieA; part of Sie system and metalloproteinase motif protein, 
sieB ; cro-like, cro; antirepressor, ant; excisionase, xis. Genes 
encoding similar proteins are shown as arrows with identical 
pattern. No similarity to other phage proteins, white; 
similar proteins with no identified function, black. Small 
black arrows indicate promoters identified by biological 
experiments, while locations of identified repressor operator 
sites are shown as small black boxes. 


Table 36-4 summarizes the core regions and the locations 
of the attB sites for the lactococcal phages. The attB sites 
for several of the phages were localized in noncoding regions 
of the chromosome and hence no gene disruption occurs 
by lysogenization. Exceptions were TP901-1 and TPW22
lysogens, which both contain a disrupted ORF, and phage
bIL309, which inserts into a tRNA(Arg) gene but
preserves an intact gene (table 36-4).
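
Locating candidate core sequences such as those listed in table 36-4 can be sketched with a simple two-strand motif search. The code below is an illustrative sketch, not the method used in the cited studies; the function names and the toy sequence are assumptions. Note that a 5 bp motif such as TCAAT is expected to occur by chance roughly once every 4^5 = 1024 positions per strand, so a core match alone does not identify an attachment site; the flanking sequence matters as well.

```python
def reverse_complement(seq: str) -> str:
    """Reverse complement of an unambiguous DNA sequence."""
    complement = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(complement[base] for base in reversed(seq.upper()))


def find_core_sites(sequence: str, core: str):
    """Return sorted (0-based position, strand) pairs for every core match.

    '+' means the core is read on the given strand; '-' means its reverse
    complement is found there, i.e. the core lies on the opposite strand.
    """
    sequence = sequence.upper()
    hits = []
    for strand, motif in (("+", core.upper()), ("-", reverse_complement(core))):
        start = sequence.find(motif)
        while start != -1:
            hits.append((start, strand))
            start = sequence.find(motif, start + 1)  # allow overlapping hits
    return sorted(hits)


# Hypothetical sequence containing the TP901-1 core (TCAAT) on both strands.
toy = "GGTCAATCCATTGAGG"
print(find_core_sites(toy, "TCAAT"))
```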

Excision of a TP901-1-based integration vector is dependent
on a functional integrase. However, the frequency of
excision is very low when the integrase is the only phage
protein present. In contrast, 100% excision is found when
ORF7, encoding the excisionase, is provided along with
the integrase (16). Excisionases have not been reported
for other lactococcal bacteriophages; however, a phage
integrase and excisionase are expected to co-function
during excision of the integrated prophage (as reviewed in
chapters 7 and 8). Thus, a homologous integrase implies
a homologous excisionase, and comparison of proteins
encoded by the early region of phages containing essentially
the same integrase (r1t, Tuc2009, bIL285, BK5-T) revealed
only one highly conserved protein present in all these
phages (figure 36-3). We suggest that this small protein is
the excisionase of these phages (r1t: ORF6; Tuc2009:
ORF7; bIL285: ORF7; BK5-T: ORF39) (table 36-3). Data from
S. Leach et al. (personal communication) confirm this with
respect to ORF7 from Tuc2009. Phage bIL309, which
contains an integrase 40% identical to the above-mentioned
integrase, encodes a protein showing 75% identity to the
proposed excisionase and it is therefore likely that this
protein (ORF9) is the excisionase of phage bIL309
(figure 36-3, table 36-3).

The Genetic Switch 

A characteristic feature of a genetic switch in temperate 
phages is a phage-encoded repressor able to repress early 
promoters controlling expression of genes necessary for 
phage development leading to the lytic cycle. Another 
feature is a factor counteracting repression, such as the Cro
protein in phage λ, which is able to repress expression of
the CI repressor and hence is needed for the choice leading
to lytic development (reviewed in chapter 8).

Two classes of repressors have so far been identified
in lactococcal phages: the r1t and the TP901-1 classes
(table 36-3). Sequence and folding analysis of repressors
from both classes placed them in the HTH-3 family of
DNA binding proteins due to a structure in the N-terminal
end of the repressor protein. The C-terminal ends are, in
analogy with CI from λ, presumed to harbor the regions
responsible for repressor subunit-subunit interactions.
The difference between the two classes of lactococcal
repressors is their size (278 and 180 amino acids for r1t
and TP901-1, respectively), as well as the lack of sequences
for RecA-mediated autodigestion in the TP901-1 class of
repressors (figure 36-4). It should be mentioned that all
characterized temperate lactococcal phages nevertheless
are inducible by mitomycin C, including those having the
TP901-1 type of repressor.

Comparisons of the amino acid sequences of the r1t-like
repressors (r1t, bIL309, ΦLC3, Tuc2009, and also BK5-T;
table 36-3) revealed that approximately two thirds of
the total amino acid sequence was identical, while less
sequence similarity was found in the N-terminal domain.
The TP901-1-like repressors show approximately 33-36%
identity to the phage r1t-like repressors but only in a short
region (figure 36-4). This region might be involved in
protein-protein interactions with host proteins, for example
RecA. A characteristic feature is the distance between the
start sites of the lytic and lysogenic promoters (approximately
100 bp) and the presence of two operator sites
in the promoter region and often also one operator site
overlapping the first gene of the lytic operon (figure 36-3).

The repressor Rro from phage r1t was the first lactococcal
phage repressor to be analyzed. Three palindromic
operator sites were identified by gel retardation studies.
Primer extensions identified two promoters: P1 and P2
(figure 36-3) (87). The expression from the lytic P2 promoter
was studied using a translational fusion to lacZ. In the fusion
plasmid containing both rro and tec, P2 expression was
shown to be inducible by mitomycin C in the non-host
strain L. lactis ssp. cremoris MG1363, thus mimicking the
behavior of the prophage. Introduction of a 5'-located frameshift
mutation in rro was expected to result in constitutive
expression of the lytic promoter. However, P2 expression
instead became gradually more derepressed during growth
of the bacterium, and this phenomenon was
not explained. Temperature-sensitive mutants in the Rro
repressor were isolated by random mutagenesis of three
amino acids (F50, V54, and P61), which were selected
by modeling the N-terminus of Rro on the phage λ CI
structure and using the information from the temperature-sensitive
A66T mutation in this protein (presumed to correspond
to V54 in Rro) (86). These results strongly suggest
that the DNA binding domain of Rro from phage r1t,
as in λ, is composed of a helix-turn-helix motif in the
N-terminus of the protein. The repressor protein encoded
by bIL309 is nearly identical to Rro, suggesting that r1t
and bIL309 are homoimmune phages.

Figure 36-4 Comparison of repressors of P335 temperate phages TP901-1 and r1t. Position of helix-turn-helix (HTH) motifs
representing putative DNA binding domain are indicated with different shading in TP901-1 and r1t, while the conserved
motif found in all lactococcal phage repressors is indicated as a black box. For r1t the position of the mutations leading to
a temperature-sensitive repressor (ts) and the putative signature sequence for RecA-catalyzed autodigestion (AG) are also
indicated.

An isolated clear-plaque mutant of phage ΦLC3 contains
an amber mutation in orf286, suggesting that the gene
encodes the immunity repressor. However, this has not
been further investigated (10, 69). ORF286 of ΦLC3 belongs
to the r1t-type of repressors and is almost identical
to ORF4 of Tuc2009 except for four amino acids. The
two phages ΦLC3 and Tuc2009 are therefore proposed to
be homoimmune (10).

Activity of two divergent promoters, P1 and P2 in ΦLC3,
was confirmed by the use of transcriptional fusions, and
transcriptional start sites were identified by primer extension
analysis (10, 11). Two operator sites overlapping both
promoters were identified by footprint analysis, while
binding to a third operator site located 500 bp downstream
in the lytic operon was shown by gel retardation (10)
(figure 36-3). The binding region contains only imperfect
inverted repeats with different spacing. However, a direct-repeat
sequence, CGTGGTT, was also identified (10). The
proposed asymmetric operator sites were also found for the
almost identical CI from phage Tuc2009, and alignment of
the promoter regions of the two phages shows considerable
identity (66).
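
The distinction between palindromic (inverted-repeat) operators and a direct repeat such as CGTGGTT can be made concrete with a short sketch. This is an illustration only: it detects perfect palindromes, whereas the ΦLC3 binding region is described as containing imperfect inverted repeats, and the test sequence is invented.

```python
def reverse_complement(seq: str) -> str:
    """Reverse complement of an unambiguous DNA sequence."""
    complement = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(complement[base] for base in reversed(seq.upper()))


def find_palindromes(seq: str, length: int):
    """Windows of the given length that equal their own reverse complement."""
    seq = seq.upper()
    return [
        (i, seq[i:i + length])
        for i in range(len(seq) - length + 1)
        if seq[i:i + length] == reverse_complement(seq[i:i + length])
    ]


def find_motif(seq: str, motif: str):
    """0-based start positions of every copy of a motif (same strand only)."""
    seq, motif = seq.upper(), motif.upper()
    return [i for i in range(len(seq) - len(motif) + 1)
            if seq[i:i + len(motif)] == motif]


# Invented sequence: two direct copies of CGTGGTT flanking one 6 bp palindrome.
toy = "AACGTGGTTAGGATCCTTCGTGGTTAA"
print(find_palindromes(toy, 6))
print(find_motif(toy, "CGTGGTT"))
```

A direct repeat reads identically on the same strand at each copy, while a palindromic site reads identically on the two strands, which is why the two searches are kept separate.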

In vitro experiments predict that the repressor ORF286
of ΦLC3 can repress both promoters, P1 and P2. However,
only a 2-fold repression of the lysogenic promoter P1 was
observed in vivo (10). Interestingly, the activity of the P1
promoter increased 10-fold when both repressor and the
Cro-like protein, ORF76, were present and it was suggested
that ORF286 is able to stimulate its own synthesis at
low concentrations (11). Furthermore, the results suggest
that ORF76 is able to repress the activity of both the lytic
and lysogenic promoters and it was demonstrated that
ORF76 binds specifically to the genetic switch region, albeit
with a lower affinity than the repressor (11). The operator
sites for ORF76 binding could not be determined based on
gel retardation experiments and the authors suggest that
the repressor (ORF286) and the Cro-like protein (ORF76)
compete for DNA binding (11).

By the use of a plasmid-borne translational fusion it
was shown that the lytic promoter (P2) of Tuc2009 is
induced by addition of mitomycin C in the presence of the
CI and Cro proteins. Introduction of a nonsense mutation
in cI did not result in constitutive P2 expression. Instead,
a gradual activation was seen, similar to results obtained
with phage r1t (87). His-tagged Cro protein binds to the
intergenic region and, in contrast to the results obtained
with the similar ORF76 protein encoded by ΦLC3, it is
proposed that Cro of Tuc2009 functions as an activator for
the lytic promoter (D. van Sinderen, personal communication).
Further experimentation is required to clarify the
role of the Cro-like proteins in regulation of the genetic
switch of phages ΦLC3 and Tuc2009.

The CI proteins from phages r1t and Tuc2009 differ
only in their N-terminal region. A helix-swap experiment
was performed (D. van Sinderen, personal communication)
resulting in a Tuc2009 repressor that binds to
r1t operator sites but not to Tuc2009 operators. This
assigns the specificity of DNA binding to the helix
region KTTISNYEV of Tuc2009. It was also shown that
His-tagged repressor proteins from both r1t and Tuc2009
give immunity to the corresponding phage. However, the
phages are heteroimmune as expected (D. van Sinderen,
personal communication).

ORF37 encoded by the second gene in the lytic operon
of phage BK5-T was proposed to be involved in the
genetic switch, since it seems to bind to the promoter
region and also contains a putative helix-turn-helix DNA-binding
motif. However, ORF37 does not show similarity to
proteins encoded by P335 phages, but it shows similarity
to topological Cro proteins in Streptococcus thermophilus
phages (78, 88).

In phage TP901-1 two divergently located promoters,
PR (lysogenic) and PL (lytic), were identified by primer extension
analysis and promoter fusions (75, 76). The phage
repressor (CI), encoded by orf4, was shown to be necessary
for repression of both promoters and to confer immunity
of the host strain to TP901-1 infections (76). Two clear-plaque
mutants of TP901-1 have been analyzed, both
containing mutations in the ribosome binding site for CI
that reduce CI expression from the phage (K. Hammer,
unpublished results). The CI repressor was shown to bind
in vitro to a 317 bp DNA fragment covering PL and PR, and
the binding was found to be cooperative (56). DNase I footprinting
identified two palindromic binding sites, OL and
OR (figure 36-3). Three operator mutants selected as having
increased expression from the lytic promoter, PL, were all
found to be located in OL, confirming the major role of OL in
repression of the lytic promoter. A third O site, OD, located
downstream of mor, was identified by gel retardation (56).
The purified repressor was found to exist in solution both
as a dimer and as multimeric forms (hexamers or higher),
suggesting that multimeric forms of the repressor could
be involved in cooperative binding to all three operator
sites. The location of the operator sites results in more repression
of the lytic promoter, PL, than of the lysogenic promoter,
PR, during cooperative binding of the repressor.

In phage TP901-1 the first gene in the lytic operon (mor
for modulator of repression) was found to influence the
repression of the lytic promoter, PL, and the results suggested
that the relative amounts of CI and MOR after phage
infection determine the decision between a lytic or lysogenic
life cycle (76). For instance, when L. lactis ssp. cremoris is
transformed with a plasmid containing the two divergent
promoters from TP901-1 (PL and PR), cI and mor, clonal variation
is observed, since PL is either active or repressed in each
transformant. The clonal variation required the presence of
both cI and mor. The repression of the promoters was still
dependent on the repressor since MOR alone had no effect
on the activity of the PL promoter. However, a surplus of
MOR was able to relieve CI repression; conversely, when PL
was derepressed, a surplus of the CI repressor resulted in
repression of the promoter even though MOR was present.
Furthermore, mitomycin C induction of the lytic PL promoter
required the presence of both CI and MOR. CI-MOR
protein interactions thus have been suggested to explain
the data for this TP901-1 switch (76). Phage bIL285 encodes
nearly identical repressor, operator sites, and MOR proteins,
suggesting that bIL285 and TP901-1 are homoimmune
and furthermore that the genetic switch is identical in
the two phages. The temperate phages ΦPVL and Φ1205
from Staphylococcus aureus and Streptococcus thermophilus,
respectively, also show high similarity to TP901-1, indicating
the same kind of genetic switch also in these phages (56).

Surprisingly, the virulent phages Φ31 and ul36 were
found to encode CI homologs (65, 77) which show some similarity
to the N-terminal of the TP901-1 repressor (table 36-3).
By the use of plasmid-borne transcriptional fusions in the
non-host strain, MG1363, it was shown that the presence
of the cI gene of Φ31 does not repress transcriptional initiation
of the lytic promoter, P1, whereas expression from
the lysogenic promoter, P2, was repressed 2-fold. In contrast,
the presence of the first gene (cro) in the lytic gene cluster
of Φ31 efficiently represses both promoters (77), which
may represent the ability of CRO to shut down early gene
expression during infection. It seems likely, therefore, that
CI has lost the ability to repress the lytic phage promoter.
To evolve into a potent virulent phage, Φ31 may furthermore
have accumulated mutations in the cI gene giving
a dominant-negative phenotype in the presence of an
active repressor from the host chromosome. The finding
that small C-terminal deletions in CI re-establish the ability
of the mutated CI to repress the Φ31 lytic promoter
supports this hypothesis (38). The truncated CI protein was
further shown to repress 10 different virulent phages,
demonstrating a whole family of virulent phages which
have probably evolved from a temperate phage having
a TP901-1-like repressor. Phage mutants that were resistant
to the repression from the truncated CI repressor identified
palindromic operator sites overlapping the lytic promoter
and at the end of cro (38). By sequence comparison
a third operator site was located between the -35 regions
of the two divergent promoters. The location of the three
operator sites matched exactly the location identified for
the operator sites in TP901-1, and the operator sequences
also showed considerable similarity. The CRO protein of
Φ31, however, showed only 50% identity to the MOR protein of
TP901-1 and has a different function,
since it is able to repress both of the early phage promoters
by itself.

The Cro-like protein is not encoded as the first gene in
the lytic operon in all temperate P335 phages (figure 36-3).
In phage ΦLC3, ORF63 is identical to the corresponding
orf36 in phage BK5-T and ORF4 of phage bIL309
(figure 36-3), while some similarities (50%) to ORF8 of
phage r1t and ORF9 of phage TP901-1 were found. It has
been reported that these proteins of phages ΦLC3 and
BK5-T do not bind to promoter-containing DNA in vitro,
suggesting that they may not be involved in the genetic
switch (10, 78). Future research may clarify the role (if any)
of the genes preceding the Cro-like gene in the regulation of
the genetic switch in temperate P335 phages.

Superinfection Exclusion 

Superinfection exclusion (Sie) functions are expressed
in the prophage and confer immunity of the host cell to
heterologous superinfecting phages; they are not involved in
maintenance of the lysogenic state. Recently the superinfection
exclusion gene (sie2009) has been identified in phage
Tuc2009 (81). Sie2009 is associated with the membrane
and is predicted to contain a transmembrane helix. Expression
of sie2009 in L. lactis provides little or no protection
against phages of P335 or c2 species, whereas complete
resistance against some members of the 936 phage species
was obtained (81). The authors suggest that phage proliferation
was prevented by injection blocking since cells
expressing Sie2009 allow phage adsorption but not transduction.
The P335 phages r1t and ΦLC3 encode proteins
highly similar to Sie2009 (figure 36-3, table 36-3), and in
ΦLC3 lysogens high levels of mRNA were detected from
this gene (orf173) (11). ORF2 of phage bIL309 shows
similarity in the C-terminal end to Sie409 encoded by a
prophage of L. lactis, IL409, that was found to mediate a
phage-resistance phenotype similar to Sie2009 (81). A prophage
in L. lactis F7/2 encodes another superinfection exclusion
system consisting of two genes (sieF7/2A and sieF7/2B),
which both must be expressed to obtain a phage resistance
phenotype for three 936-type phages tested. SieF7/2A and
SieF7/2B are identical to ORF2 and ORF3 of phage TP901-1
and show high similarity to ORF2 and ORF3 of bIL285 (81).
In accordance with this, orf2 and orf3 of phage TP901-1
are expressed in a strain lysogenic for TP901-1 (75). ORF34
of phage BK5-T (figure 36-3) is expressed in the prophage
and it was suggested that this protein might have a
role in superinfection exclusion, although no evidence
for this function was obtained (14).

Antirepressor and Genetic Organization of the Lysogeny Module

In E. coli phage P1 and Salmonella phage P22 (reviewed
in chapters 24 and 29, respectively), the antirepressor
Ant destroys repressor activity by protein-protein interactions.
So far no clear evidence for such protein functions
in phages infecting lactic acid bacteria has been
presented. The MOR protein from TP901-1 is suggested to
function as an antirepressor (76) based on the in vivo
results, and the Ant homolog (ORF6) from Tuc2009 has
been shown in vitro to counteract DNA binding of the
Cro protein, which has been proposed to function as a
phage activator (S. Leach, personal communication). In the
streptococcal phage, Sfi21, the Cro protein has also been
shown in vitro to inhibit DNA binding of the repressor (20).
In most P335 phages Ant proteins have been suggested
based on partial similarity to the Ant proteins of phages P1
or P22 or other phage proteins located downstream of the
cro-like gene. The putative antirepressors of P335 phages
contain a two-domain structure (72), and, based on amino
acid similarity, two classes of antirepressors are observed
(table 36-3). No significant similarity was found between
the two classes.


The genetic organization of the lysogeny module of
Siphoviridae phages infecting low G+C content Gram-positive
bacteria is highly conserved (72) with the following gene
order: integrase, superinfection exclusion, in some cases a
metalloproteinase motif gene, repressor, and a cro-like gene
followed by a putative antirepressor (figure 36-3). However,
some phages contain insertions of up to three genes
upstream of the cro-like gene (figure 36-3). The recent
results from the P335 phages show that the metalloproteinase
is part of a two-gene superinfection exclusion
system in some phages (81). Furthermore, our comparative
analysis identified the excisionase gene downstream of
the putative antirepressor in all the sequenced temperate
P335 phages including BK5-T. Downstream of the excisionase
gene the temperate P335 phages contain a variable
number of genes, which currently are of unknown function.
However, in phage TP901-1 we have observed that transcription
from the early lytic promoter, PL, is repressed
during the later stages of infection and that some of these
genes with otherwise unknown functions may potentially
be involved in this process (K. Hammer, unpublished data).


DNA Replication 

Even though bacteriophages require many bacteria-encoded 
proteins for replication, they often, in addition to the 
phage origin of replication, contain a genomic region that 
specifies a protein used for sequence-specific initiation of 
replication and one or more proteins involved in replication
initiation. A phage-encoded resistance (Per) phenotype
has been described among the P335 phages as
resistance against phage infection by hosts containing the 
phage origin of replication in multiple copies. Titration of 
proteins involved in DNA replication has been suggested 
to account for this phenomenon (49). 

In bacteriophage Tuc2009 a 160 bp internal region in
orf16, containing four direct repeats, inhibited DNA replication
(and hence lytic growth of Tuc2009) when present
on a multi-copy plasmid. A library of mutations in this
fragment (ori2009) was analyzed and most of the mutations
that abolished the inhibitory effect on proliferation of phage
Tuc2009 were localized in the direct repeats (82). Using
gel retardation analysis, ORF16 was shown to be able to
bind to ori2009 and ORF16 therefore was proposed to be
the replisome organizer of Tuc2009, now called Rep2009.
Interestingly, Rep2009 could still retard the DNA fragments
harboring the isolated Tuc2009 proliferation-inhibiting
mutations and the authors propose that another protein,
in addition to Rep2009, may bind to the ori2009 region (82).
Recently, it was shown that strains producing antisense
mRNA of rep2009 and orf17 acquired a very pronounced
resistance against Tuc2009 and a reduction in internal
phage DNA replication, indicating that the protein product
of orf17 very likely is involved in Tuc2009 replication (80).

A Per phenotype of ori2009 was observed for phages
Q30, Q33, and ul36 (80) and the presence of the putative
replication protein and origin of phage ul36 were similarly
shown to inhibit proliferation of phage ul36 and Q33 (13).
This verifies that these origins are functional homologs.
In addition, sequence analysis of phage ul36 (figure 36-5)
and hybridization studies of phages Q30, Q33, and ul36
showed regions that are highly similar to ones found in
Tuc2009 (orf15 to orf19). However, orf18 of Tuc2009 encoding
a putative type II methyltransferase was not present
in these phages (13, 80). Two strains producing antisense
mRNA of the Tuc2009 genes rep2009 and orf17 showed resistance
to the phages Q30, Q33, and ul36, which strongly
suggests that both rep2009 and orf17 are involved in DNA
replication in these phages (80).

The protein product of orf11 (Pro11) of phage r1t was
shown to bind specifically to sequences located within orf11
and footprinting revealed that the protected DNA region
contained four 6 bp direct repeats, suggesting that these
repeats constitute the ori of phage r1t (119). No in vivo experiments
have been provided. However, the Pro11 protein
shows similarity to the replication protein G38P of B. subtilis
phage SPP1, indicating that Pro11 may be involved in
r1t DNA replication (phage SPP1 is reviewed in chapter 23).
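
Repeat-rich regions of the kind described for these origins can be screened for computationally by counting identical k-mers. The sketch below is a simplified illustration, not the footprinting or mutational analysis reported in the cited work; the function name, the `min_copies` parameter, and the toy sequence are assumptions, and only exact copies on one strand are detected.

```python
from collections import defaultdict


def direct_repeats(seq: str, k: int, min_copies: int = 2) -> dict:
    """Map each k-mer occurring at least min_copies times to its positions.

    Positions are 0-based. Overlapping occurrences are counted, so results
    for low-complexity sequence should be interpreted with care.
    """
    positions = defaultdict(list)
    seq = seq.upper()
    for i in range(len(seq) - k + 1):
        positions[seq[i:i + k]].append(i)
    return {kmer: pos for kmer, pos in positions.items() if len(pos) >= min_copies}


# Invented sequence with four copies of a 6 bp repeat separated by 2 bp spacers.
toy = "TTGACAAATTGACACCTTGACAGGTTGACA"
print(direct_repeats(toy, k=6, min_copies=4))
```

Such a screen only nominates candidates; whether a repeat cluster acts as an origin still has to be shown experimentally, for example through the Per phenotype described above.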

Within orf13 of phage TP901-1, several repeats were
located and the presence of these repeats on a plasmid
in trans was shown to confer Per resistance of the host
strain against TP901-1 infection (120). TP901-1 prophages
that contain mutations in the replication region were
constructed. One prophage mutant contained mutations
in the repeats without any change of the amino acid
sequence of ORF13, while an amber stop codon in orf13
was introduced in another. When the mutated prophages
were induced, internal phage DNA replication was absent
and the number of plaque-forming phages was reduced
10 -fold compared with the wild-type. This supports the
hypothesis that the repeats are the origin of replication of
TP901-1. Furthermore, the effect of the amber mutation
in orf13 on phage proliferation and internal phage DNA replication
could be relieved by the presence of an amber
suppressor, showing that ORF13 is involved in TP901-1
DNA replication (120).

A Per phenotype identified putative, highly identical
origins of replication in the virulent phages φ31.1 and
ul36.1 (13, 37). ORF49 of BK5-T showed significant similarity
to the putative replication proteins of phages φ31.1
and ul36 (ORF269 and ORF235, respectively), and a repeat-rich
region within orf49 was shown to confer phage resistance
(78). In the virulent phage φ31, an AT-rich region
present on a high-copy-number plasmid reduced the efficiency
of plating of phage φ31 by 10⁻⁶, indicating that this
region may carry the origin of replication (77). The φ31
origin of replication was located downstream of a putative
primase gene, and its position in a noncoding region is
unique among the P335 phages.
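
The efficiency of plating (EOP) mentioned above is conventionally
the ratio of the phage titer on a test strain to the titer on a
sensitive reference strain. The following is a minimal Python sketch
of that arithmetic only. The plaque counts, dilutions, plated volume,
and function names are hypothetical and were chosen simply to give an
EOP of about 10⁻⁶; they are not data from (77).

```python
def titer_pfu_per_ml(plaques, dilution, volume_ml):
    """Titer in PFU/ml from a plaque count at a given dilution and plated volume."""
    if plaques < 0 or dilution <= 0 or volume_ml <= 0:
        raise ValueError("plaques must be >= 0; dilution and volume must be > 0")
    return plaques / (dilution * volume_ml)


def efficiency_of_plating(test_titer, reference_titer):
    """EOP = titer on the test strain divided by titer on the reference strain."""
    if reference_titer <= 0:
        raise ValueError("reference titer must be > 0")
    return test_titer / reference_titer


# Hypothetical plate counts (assumptions for illustration only).
# Reference: sensitive host without the plasmid, 150 plaques at a 1e-7 dilution.
reference = titer_pfu_per_ml(plaques=150, dilution=1e-7, volume_ml=0.1)
# Test: host carrying the high-copy-number plasmid, 150 plaques at a 1e-1 dilution.
test = titer_pfu_per_ml(plaques=150, dilution=1e-1, volume_ml=0.1)

eop = efficiency_of_plating(test, reference)
print(f"EOP = {eop:.1e}")
```

Expressing resistance as a ratio of titers, rather than as raw plaque
counts, makes results comparable between experiments that use
different phage stocks.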

Figure 36-5 Genetic organization of the replication region of
P335 phages. Identified and putative functions are indicated
at the top of the figure and genes are aligned accordingly.
Putative topoisomerases, top; single-stranded binding
proteins, ssb; replication initiation proteins, rep; homologs
of ORF17 of Tuc2009, orf17(2009); DnaC homologous proteins,
dnaC; methylase, met; putative Holliday junction resolvases,
rusA. Genes encoding similar proteins are shown as arrows
with identical patterns. No similarity to other phage proteins,
white; similar proteins with no identified function, black or
gray. Small black boxes indicate the locations of origins of
replication identified by experiments.


The replication proteins of P335 phages can be divided
into several similarity groups (table 36-3). One contains
the TP901-1-like replication proteins, among which the
replication proteins of phages BK5-T, ul36.1, and φ31.1 are
highly similar. Computational analysis suggested that the
replication protein of phage TP901-1 might be divided into
two domains linked by a short hinge region that encodes
the origin of replication (36). The replication proteins of
phages Tuc2009, ul36, and bIL285 are highly similar,
while the replication proteins of phages bIL309, 4268, r1t,
and bIL286 show some similarity to the Tuc2009-like
replication proteins (figure 36-5, table 36-3). However, these
homologies are restricted to either the N-terminus (r1t) or
the C-terminus (bIL286, bIL309, and 4268) of the protein,
indicating a two-domain structure as suggested in (36).
The putative replication protein of phage φ31 does not
show any similarity to either group of lactococcal
phage-encoded replication proteins. However, some similarity to
proteins of S. thermophilus phages was found (77).

Based on amino acid similarity several proteins have 
been suggested to be involved in phage DNA replication 
(figure 36-5). These are putative topoisomerases,
single-stranded binding proteins, DnaC homologous proteins,
Holliday junction resolvases, and dUTPases. The function
of the putative topoisomerase and single-stranded binding
protein does not seem to require corresponding similar
replication proteins, suggesting that no specific protein
interaction takes place during phage DNA replication
(compare TP901-1 and Tuc2009 in figure 36-5). The r1t
RusA protein encoded by orf14 was shown to resolve
Holliday junctions in vitro, and it furthermore promotes
DNA repair in E. coli strains lacking the RuvABC resolvase
(103). The putative Holliday junction resolvase encoded
by many P335 phages may thus resolve branched
DNA structures formed during DNA replication or, alternatively,
during DNA packaging. Two different types of Holliday
junction resolvases are found. However, the presence of 
a specific type seems to be independent of the sequence of 
the replication protein. Interestingly, all P335 phages 
encode nearly identical dUTPases, which are significantly 
different from the chromosomally encoded protein in 
L. lactis IL1403. 

Regulation of Middle and Late Gene Expression

In the virulent bacteriophage φ31, a promoter has been
identified that is induced 20 minutes after infection, that
is, in the middle phase of infection as defined by the authors
(114). Transcription from this promoter is induced by the
presence of a phage φ31-encoded activator located upstream
of the promoter on the φ31 genome (89, 114). The activator
of φ31 was suggested to bind to sequences overlapping
the -35 region of the promoter, and deletion analysis
showed that removal of the region from -54 to -45
eliminated promoter activity (114). The promoter and
corresponding activator are conserved between phage φ31 and
the two temperate bacteriophages r1t and φLC3 (113).

A TP901-1 promoter, active 30-40 minutes after infection,
has been identified. It was designated the late promoter
of phage TP901-1 since it becomes fully active in the late
phase of the lytic cycle and controls transcription of the
late-expressed region of the TP901-1 genome (22). This
promoter is located upstream of the terminase subunits,
a location that corresponds to the middle promoter of φ31.
Thus, the two promoters may be functionally equivalent.
The TP901-1 promoter is tightly regulated and requires
ORF29 for activity. The transcriptional start site of the
promoter was identified by primer extension and is located
in the intergenic region between orf29 and orf30. The
region located -85 to -61 bp upstream of the transcriptional
start site was shown to be necessary for promoter
activity, and a region from -79 to -32 was found to be
protected by ORF29 using footprinting analysis; this region
contains four direct repeats (92). Similarity searches
revealed that an ORF29 homolog and a sequence similar
to the late promoter region are present in several genomes
of temperate lactococcal bacteriophages (high similarity:
Tuc2009, bIL286, ul36, bIL309; low similarity: 4268 and
bIL285) (22).


DNA Packaging 

The existence of cohesive DNA ends and hence cos sites
has been demonstrated for the phages φLC3 and BK5-T
(69, 78, 111). Identical cos sites (13 bp single-stranded 3'
overhangs with the sequence 5'-GTGACGGCGTGAA-3') were
determined for phages φLC3 and r1t (69, 111), while the
BK5-T cos site was identified as a 12 bp 3' single-stranded
overhang with the sequence 5'-CACACACATAGG-3'. The
latter sequence was furthermore found upstream of orf40
in the bIL286 genome, indicating that packaging of the
bIL286 genome is initiated at this cos site. By sequence
similarity the large subunit of the terminases could be
identified in the cos-site P335 phages, and the terminases
of phages BK5-T, bIL286, bIL285, and bIL309 show similarity
(table 36-3). The N-terminus of a partly sequenced
ORF encoded by φLC3 is highly similar to the large subunit
of the terminase of phage r1t. This protein shares
only limited similarity with other terminase subunits of
lactococcal phages. In contrast, the protein shows similarity
to terminases of Siphoviridae phages infecting high
G+C content Gram-positive bacteria.
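
The cos overhang sequences quoted above can be checked for length and
searched for in a genome sequence with a few lines of code. The
following Python sketch is an illustration only: the two overhang
sequences are taken from the text, whereas the function names and the
short toy genome are hypothetical and do not represent any real phage
sequence.

```python
# Overhang sequences as quoted in the text (5' to 3').
COS_OVERHANGS = {
    "phiLC3 / r1t": "GTGACGGCGTGAA",
    "BK5-T": "CACACACATAGG",
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_sites(genome, site):
    """Return sorted (position, strand) pairs for exact matches on either strand."""
    hits = []
    for strand, pattern in (("+", site), ("-", reverse_complement(site))):
        start = genome.find(pattern)
        while start != -1:
            hits.append((start, strand))
            start = genome.find(pattern, start + 1)
    return sorted(hits)


# Confirm the overhang lengths stated in the text (13 and 12 nucleotides).
for phage, overhang in COS_OVERHANGS.items():
    print(phage, len(overhang), "nt")

# Toy genome (hypothetical): one copy of the BK5-T site on each strand.
bk5t_site = COS_OVERHANGS["BK5-T"]
toy_genome = "ATAT" + bk5t_site + "GGCC" + reverse_complement(bk5t_site) + "TT"
print(find_sites(toy_genome, bk5t_site))
```

An exact-match search of this kind corresponds to how the BK5-T
sequence was recognized in the bIL286 genome; a real analysis would
also need to consider circular genomes and near matches.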

DNA packaging of the temperate phages TP901-1 and
Tuc2009 was suggested to be initiated at a pac site located
upstream of orf30 of TP901-1 and in a region upstream
of orf39 on the Tuc2009 genome (4, 23, 57). Based on protein
similarity, protein domains, and location, orf30 and orf31
of TP901-1 were suggested to encode the small and large
subunits of the terminase (23). ORF31 and ORF32 of
Tuc2009 show similarity to the N- and C-termini of the
large terminase subunit of TP901-1, respectively, indicating
that ORF31 and ORF32 of Tuc2009 may encode separated
domains of the large terminase subunit.

Morphogenesis 

Structural phage proteins have often been identified by
SDS-PAGE and N-terminal sequencing, followed by analysis
of the phage DNA sequence. SDS-PAGE investigations of
r1t phage particles showed the presence of two very large



586 PART V: PHAGES BY HOST OR HABITAT 


protein bands of 160 and 190 kDa each (111). They both had
an N-terminus corresponding to ORF31, strongly indicating
that these proteins are covalently linked head proteins
as observed for mycobacteriophage L5 and coliphage
HK97 (46, 96). Another very abundant protein, ORF37,
is probably the major tail protein (111). Less abundant
protein bands corresponding to ORF27 and ORF45 were
also observed. Similarity searches reveal that ORF27 may
encode the large subunit of the terminase, whereas ORF45
shows weak similarity to the baseplate protein of phage
TP901-1. A self-splicing group I intron seems to be contained
within ORF41 (85).

Phages TP901-1, Tuc2009, and ul36 show extensive
similarity in the genomic region encoding the structural
proteins. The major differences are found in the baseplate
proteins, the proteins specifying the neck passage
structure (which is not found in ul36), and the major
head proteins. In TP901-1 the major head protein (ORF36)
was identified by SDS-PAGE and immunogold electron
microscopy (58, 59). It is similar to the major head protein
of Tuc2009 (major protein 2) encoded by orf47 and
orf39, which are suggested to be linked after excision of orf38
that specifies a group I intron. Phage ul36 encodes a major
head protein that is different from all other P335 phages
(65). The major tail protein and a baseplate component
were identified in TP901-1 (59, 93). They are similar to
proteins identified as major structural proteins 2 and 1 of
Tuc2009, encoded by orf44 and orf53, respectively (4, 30).
A tail component that radiates from the connection between
head and tail, called a neck passage structure, was
identified in TP901-1 (59). The neck passage structure
(ORF51) shows high similarity to ORF47 from phage r1t
and to ORFs encoded by phages belonging to the lactococcal
936 phage group. It has been shown that ORF45 functions
as a tape measure protein involved in tail length
determination, since introduction of an in-frame deletion
or duplication in orf45 of TP901-1 shortens or lengthens
the TP901-1 tail, respectively. Furthermore, it was shown
that ORF45 is important for assembly of the TP901-1
tail and that the baseplate and tail structures of TP901-1
assemble through a branched baseplate and tail assembly
pathway (93).

In phage BK5-T, two major structural proteins were identified,
and N-terminal sequence analysis determined that
they were encoded by orf7 and orf12, respectively (78). The
major head protein, encoded by orf7, was shown to be
subject to proteolytic cleavage, since the protein product
identified as the major head protein starts at amino acid
number 110. The BK5-T protease responsible for the processing
is most likely the ClpP-like protease encoded by orf6.
The other major structural protein, encoded by orf12, is
probably the major tail protein (78). The major head protein
of BK5-T is nearly identical to the putative major head
proteins of phages bIL286 and 4268, whereas some similarity
was found to the major head protein of bIL285 (table 36-3).
Phages bIL285, bIL286, bIL309, and 4268 furthermore
encode a protease, suggesting that the major head proteins
of these phages are also subject to proteolytic cleavage.

Cell Lysis 

Although some bacteriophages have evolved alternative 
systems for cell lysis, a generalized model involves the 
action of two principal proteins, holin and lysin (see chapter 
10 for a general review of phage lysis). According to this 
model the holin molecules mediate the transport of the
lysin protein across the cytoplasmic membrane by creating
pores in the membrane. Subsequently, lysin degrades
the bacterial cell wall from the outside. The P335 phages 
encode two types of murein-degradation enzymes (lysins) 
that attack either the glycosidic linkages between the 
amino sugars of the peptidoglycan (called muramidases) 
or the N-acetylmuramoyl-L-alanine amide linkage between 
the glycan strand and the crosslinking peptide (called 
amidases). 

The lysins of phages Tuc2009 and φLC3 have been shown
to have lytic activity against L. lactis and were suggested
to have muramidase activity (4, 8). The lysin and holin
proteins of phages Tuc2009, φLC3, and TP901-1 are
highly similar (table 36-3). ORF48 and ORF49 of phage r1t
have been suggested to encode a holin and a lysin with
amidase activity, respectively (111). Phage BK5-T was also
suggested to encode a lysin (ORF27) with amidase activity, 
while ORF25 and ORF26 of this phage have been proposed 
to encode a two-component holin system similar to that 
suggested for phages infecting S. thermophilus (78, 104). 
Phages bIL285, bIL286, and bIL309 each encode a holin 
and lysin similar to those encoded by BK5-T (table 36-3). 

Concluding Remarks 

Revised Taxonomy 

DNA hybridization studies indicated that BK5-T and a few
other uncharacterized temperate phages were not related
to the type phage P335 (15), and BK5-T was therefore not
originally placed in the P335 species (55). However, in
the early region of the BK5-T genome identity was found
to several P335 phages. In addition, BK5-T and the phages
bIL285, bIL286, and bIL309 show protein similarity in
the structural gene clusters, exemplified by the large terminase
subunit and major head protein (table 36-3), suggesting
that these five phages belong to the same taxonomic
group, namely the P335 phage species. The P335 phage
species could, however, be divided into three different
groups based on the protein similarity in the structural
proteins as shown in table 36-5. This new classification
of the lactococcal P335 phages is in agreement with the
taxonomy suggested in (30, 35, 65). In contrast to the
prolate-headed c2 phages or the small isometric-headed 936
phages, which contain highly similar genomes (more than 90%
nucleotide identity), the P335 phages show lower and variable
levels of DNA similarity (5-50%). BK5-T and bIL286
show extensive DNA identity in the structural gene cluster,
and 45% overall identity, while about three quarters of the
genomes of TP901-1 and Tuc2009 could be aligned (36).
Furthermore, the virulent phage ul36 shows extensive identity
to both TP901-1 and Tuc2009 (65).

Table 36-5 Revised Taxonomy of Lactococcal P335 Phages

Phage type              Prototype  Other lactococcal  Related phages attacking different
                        phage      members            hosts (in parentheses) a

B1; P335 species: cos   r1t        φLC3               TM4 (Mycobacterium)
                                   φ31 (L)            SF370.3 (d) (S. pyogenes)

B1; P335 species: pac   TP901-1    Tuc2009            Sfi11 and O1205 (S. thermophilus)
                                   ul36 (L)           φg1e and LL-H (Lactobacillus)
                                                      A118 (Listeria monocytogenes)
                                                      SF370.2 (d) (S. pyogenes)

B1; P335 species: cos   BK5-T      bIL285             Sfi21, Sfi19, 7201, DT1 (S. thermophilus)
                                   bIL286
                                   bIL309
                                   4268 (L)           φadh (Lactobacillus)

a S., Streptococcus; L, lytic phage; d, defective phage.

Phages infecting different host genera also encode structural
proteins that are similar to those of the lactococcal P335
phages, suggesting that these phages are related and could
be placed in the same taxonomic group. However, in only
a few cases could identity at the nucleotide level be detected
between phages infecting different host genera. This
was found between phages BK5-T/bIL286/bIL309 and
S. thermophilus phage Sfi21 (30, 35), and also
for phage r1t and Streptococcus pyogenes phage SF370.3 (36).
A revised taxonomy of the Siphoviridae has been proposed
using comparative genomics (19). In this proposal the
fully sequenced lactococcal phages of the phage species
c2, 936, and P335 were placed into five genera. As described
above, members of the P335 group, including BK5-T,
constitute three new genera. Furthermore, with the exception
of the c2-like genus, the four remaining lactococcal
genera were proposed to be members of a λ supergroup
within Siphoviridae since they preserved the λ-like gene
order in the structural genes (see chapter 27 for review of
lambdoid phage comparative biology).

Evolution 

The existence of two types of lactococcal phage populations
has been proposed (30). One type consists of virulent
phages with highly similar genomes (the c2 and 936 phage
species), whereas the other corresponds to temperate phages
showing very different levels of DNA similarity (the P335
phage species). The virulent P335 phages may all originate
from temperate phages by deletion events. Hence, virulent
phage ul36 contains remnants of a lysogeny module including
a putative integrase gene (65), while phages φ31 and
4268 contain a putative repressor but neither a phage
attachment site nor an integrase gene has been identified
(77; AF489521). However, the virulent phages of the P335
species may in principle return to the temperate status by
acquiring the deleted part of the lysogenic module from
related temperate phages. This cannot be achieved by virulent
phages of the c2 and 936 species, since they do not
have any temperate counterpart (30).

Many examples of horizontal gene transfer amongst
the temperate phages may be suggested on the basis of
genome comparisons (e.g., see chapters 4 and 27),
but biological experiments have also shown this to be
the case. Phage φ50, which is completely resistant to a
plasmid-encoded R/M system, was found to encode a functional
domain of the methylase gene identical to the
plasmid-encoded methylase. This strongly indicates a recent
genetic exchange between the phage and the plasmid (50).
Mutant phages of φ30, φ33, and ul36 were isolated when
the corresponding wild-type phage was plated on strains
containing a plasmid with multiple copies of ori2009,
normally conferring a phage-resistance phenotype. Two
of three analyzed phages showed genomic reorganizations
in the DNA replication region, while the third may harbor
point mutations (80). Similar results were obtained for
phage φ31 (37). In both cases it was speculated that the
recombinant phages have acquired new DNA replication
regions by homologous recombination with prophage or
prophage remnants in the host chromosome. The
presence of an abortive resistance mechanism, abiK, that
was shown to interfere with phage DNA replication also resulted
in the occurrence of recombinant phages after challenge
with phage ul36 (13). These authors moreover showed
that the sequence obtained by the recombinant phages originated
from the host chromosome and defined the point
of exchange between the chromosome and the incoming
phage (13). More generally, it has been suggested that
short regions of micro-similarity located in noncoding
regions of the P335 phage genomes may be points of
exchange between phages (23, 25).

The presence of prophages in the chromosome of the 
host bacterium increases the probability for two phage 
genomes being present in the same cell at the same time, in 
contrast to a simultaneous infection by two different 
phages. Furthermore, the life cycle of virulent phages 
depends on frequent and productive cycles of infection, 
thus greatly limiting the flexibility of incorporation of 
extra DNA except for genes conferring an immediate selective
advantage. Thus, two modes of evolution of phages have
been proposed: the temperate/lytic mode and the virulent 
mode (30, 48). Extensive horizontal gene transfer, the 
frequency depending on the relatedness of the bacterial 
hosts, characterizes the temperate/lytic mode. The virulent 
mode shows very little DNA exchange with other phage 
species, with the exception of host-related functions. An 
example of this is the similarity found in the presumed 
host specificity region of all the major lactococcal phage 
species: c2, 936, and all three P335 species (30). This is in 
accordance with the finding that some lactococcal strains 
are hosts for phages belonging to different species. 

Outlook 

Important aspects of lactococcal phage research have to
some extent been limited by the difficulties in identifying
sensitive lactococcal host strains. Transduction of a
selectable marker is more sensitive than plaque assay, as
shown for phages φLC3 and TP901-1, and will be a useful
tool to identify host strains (9, 61). General transduction
of chromosomal markers was only reported in 1962 (1),
while plasmid transduction has been studied using phage
T712 (43). Transduction of plasmids containing the cos site
of the transducing phage has been reported for phages
φLC3 and sk1 (9, 29). Development of general transducing
systems for chromosomal markers would be a very
helpful genetic tool for future research in L. lactis.

Functional analysis of the lactococcal phages by isolation
and characterization of phage mutants is lagging
behind the wealth of sequence data. The genetic tools exist,
but have so far been used only in a few studies: nonsense
suppressors have been used in TP901-1 (93, 120) as well as
gene disruptions (61). Clear-plaque mutants have been
isolated in φLC3 (68) and TP901-1 (K. Hammer, unpublished
data). Phages resistant to different naturally occurring
abortive infection systems (abi) have been selected
and analyzed (2, 40). An increased effort in the area of
mutant isolation and characterization, combined with
DNA-array analysis during infection, will give important
information on structure-function analysis, including the
gene regulatory mechanisms of the lactococcal phages
and their interplay with the bacterial host. The existence
of many different naturally occurring abi systems for
lactococcal phages will greatly magnify the possibilities
of insight into these important topics.

Note Added in Proof 

The following findings on phage antireceptors involved
in determination of host specificity have been published.

In the 936 phage species, ORF18 of sk1 is involved (K. Dupont
et al. 2004. Appl. Environ. Microbiol. 70:5818-5824). In the
prolate c2 species, the L10 protein is involved in addition to
L15 (J. Rakonjac et al. 2005. J. Bacteriol. 187:3110-3121).
In the P335 species, ORF49 of TP901-1 has been shown to
form the lower baseplate disc of the phage tail and is suggested
to function as the antireceptor (C. Vegge et al. 2005.
J. Bacteriol. 187: in press).

Bacterial viruses specific for the genus Listeria were
discovered almost 60 years ago (57), and were early
reported for their usefulness in phage typing (61) of isolates
of the pathogen Listeria monocytogenes (65). In the
following years, phage typing of Listeria isolates has
proven to be a very useful method, and led to the isolation
of more than 400 phages for L. monocytogenes, L. ivanovii,
and the nonpathogenic species L. innocua, L. seeligeri,
and L. welshimeri (5, 9, 16, 19, 22, 25, 26, 31-33, 48-51, 53,
55, 63). To date, no phages infecting organisms of the species
L. grayi have been found. This chapter briefly summarizes
our present knowledge on Listeria phages, and gives an
overview on their general and particular properties, with
respect to both the basic science and the various practical
applications.

Ultrastructure, Composition, 
and Taxonomy of Listeria Phages 

Electron microscopic examinations of more than 120
Listeria phages (2, 5, 7, 10, 33, 49, 51, 61, 71) revealed a
relatively limited diversity (figure 37-1). Most phages belong
to the morphotype group B1 of the Siphoviridae family
(isometric capsid and long, noncontractile tail), in the order
Caudovirales (1) (see chapter 2 for an overview of phage
classification). These siphoviruses are divided into five
recognized species and one proposed species (table 37-1)
based upon differences in tail length (3). The remaining
phages were classified as Myoviridae of the morphotype A1
(isometric capsid and long, contractile tail). Two species
were established, based on different particle dimensions
and mode of sheath contraction. While members of the
species A511 resemble a more commonly found phage
morphotype, phages of species 4211 (such as 01761 depicted
in figure 37-1) are unique and feature a rather unusual
mode of sheath contraction in which the tail seems to
contract toward the baseplate, thereby exposing the inner
tail tube starting from the capsid.


Caudovirales generally have double-stranded DNA as
genetic material. Restriction endonuclease analysis allowed
discrimination of individual Listeria phages and calculation
of genome sizes, which range from 36 to more than
100 kb (table 37-1). The G+C content lies between 35 and
41 mol % (36, 38, 54, 70; unpublished data). A significant
correlation between ultrastructure and overall DNA
homology was found, which supports the existence of
at least five DNA homology groups among the phages
investigated (38, 54). Structural proteins of more than
40 phages were analyzed by electrophoretic methods
(33, 38, 70, 71), and, very recently, mass spectrometry (70).
Several studies indicate that the major capsid proteins
are proteolytically processed during maturation of the
head, while from the tail sheath proteins only the
N-terminal methionine is absent (36, 41, 70). The PSA virion
employs a particularly interesting mechanism for synthesis
of essential components, involving translational +1
frameshifting at the 3' ends of the major structural protein
genes for the capsid (cps) and tail (tsh) (70). Comparison
of protein profiles permits differentiation of phages and
establishment of similarity clusters, and generally corresponds
well with ultrastructure and DNA hybridization
patterns.

Host Ranges and Phage Receptors 

All Listeria phages are strictly genus-specific. The temperate
ones only recognize host bacteria of individual but specific
serovar groups, while the virulent A511-like phages
can attack strains of all species and serovars (4, 5, 31, 32,
50, 52). The Listeria O-antigens are largely determined
by the variable structure and sugar substitution of
polyribitolphosphate cell wall teichoic acids (17). It has been
demonstrated by biochemical and genetic approaches
that the teichoic acid substituents, N-acetylglucosamine
and rhamnose, are major determinants of phage adsorption
in serovar 1/2 strains, while N-acetylglucosamine and
galactose are important in serovar 4 strains (11, 62, 68).
In contrast, teichoic acids are apparently not involved in
binding of the polyvalent A511 phage. It is assumed that
the peptidoglycan itself represents its receptor, possibly
in conjunction with other, serovar-independent
carbohydrates (68).

Figure 37-1 Electron micrographs of different Listeria
bacteriophages. The phages shown belong to different
morphotypes and species. A511 and 01761 are Myoviridae
with contractile tails, while A118 and PSA are Siphoviridae
with noncontractile, flexible tails (see table 37-1 and text).


Virus Multiplication and Host Cell Lysis 

Listeria phages seem to be well adapted to their host bacteria.
Most phages can complete lytic cycles at 10-37°C,
but some are more temperature-sensitive and only multiply
at 25°C or below (25). Growth curves have been recorded for
a few phages. At 30°C, the latent period of the lytic cycle
of temperate phages infecting L. monocytogenes is between
60 and 70 minutes, followed by a rise phase of 40 to
65 minutes. An average of 25 progeny virions are released
per lysed cell. In L. ivanovii as host, latent phases up to
115 minutes were observed, resulting in a burst size of up to
40 particles. The virulent nature of A511 agrees well with
its shorter latent phase of 55 minutes and comparatively
large burst size of 40 virions (unpublished data). (For an
overview of phage infection-timing and burst-size characters,
see chapter 5.)

Lysis of infected cells occurs through the combined
action of a holin (Hol) and an endolysin (Ply), encoded by
two immediately adjacent genes at the distal end of the
late gene regions. In phage A511, the holin has not been
identified. In phages A118, A500, and φ2438, Ply (the endolysin)
was found to represent a new class of enzyme, an
L-alanoyl-D-glutamate endopeptidase, whereas phages A511
and PSA feature (different) N-acetylmuramoyl-L-alanine
amidases (44, 70, 73). An interesting aspect of these enzymes
is their modular composition and their unusual substrate
specificity, which is mediated by individual C-terminal
domains recognizing unique cell wall carbohydrates (37).
These cell wall binding domains (CBD) have distinct binding
abilities: the CBD of the Ply500 endolysin recognizes
cell surfaces belonging to serovars 4, 5, and 6, and binds
to a receptor evenly distributed in the wall. In contrast,
the CBD of Ply118 binds only to serovar 1/2 and 3 cells,
with preference for septal regions and the poles. These
distinctions indicate fundamental differences in carbohydrate
composition among the cell walls of different strains
of bacteria as well as cell wall differences along the contours
of individual bacteria.

Table 37-1 Present Status of Listeria Phage Taxonomy, and Main Virion Characteristics

Family        Species  Other relevant    Approximate b virion     Approximate   References
                       members a         size (in nm) (head       genome sizes
                                         diameter c/tail          (in kb)
                                         length d)

Myoviridae    A511                       88/200                   130-140       31, 38, 44, 71 (unpublished)
              4211     B054, 01761       62-66/230-270            41-44         33, 51, 71

Siphoviridae  P35 e                      56-60/110                36            25, (unpublished)
              2389     PSA               58-62/170-180            38            2, 33, 70
              H387                       58-62/190-200            36-40         49
              2685     B025              58-62/230-260            37-41         33, 54, 71
              2671     A118, A500, A006  58-62/270-310            38-42         25, 33, 36, 54, 71, (unpublished)

a Phages investigated and sequenced in our laboratories.
b Different phages within the species, different isolates, and different staining methods yield variable results.
c Measured from apex to apex.
d Measured including base plates.
e Proposed new species.

Immediately preceding cell wall hydrolysis, Hol is proposed
to form lesions in the cytoplasmic membrane, enabling
access of Ply to the murein (reviewed in chapter 10).
Although the primary sequences of Hol118 and Hol500 (the
Hol proteins from phages A118 and A500, respectively) are
completely different from the prototype phage λ S protein,
the overall structures of these class II holins are conserved
(44). However, timing and regulation of Hol118 function
is clearly different from the S paradigm (67). Although it
features a dual-start motif, recent findings have demonstrated
that Hol118-mediated regulation of lysis timing
represents a novelty since it occurs via a cotranscribed
intragenic inhibitor lacking the first transmembrane domain
and thereby interfering with pore formation (66).

Temperate Listeria Phages 

Genomics 

The first Listeria phage completely sequenced and analyzed
in detail was A118, a temperate bacteriophage specific for
L. monocytogenes serovar 1/2 strains (36). Its genome is a
circularly permuted collection of terminally redundant
dsDNA molecules with an average length of 43.3 kb, which
indicates 6% redundancy relative to the unit size of 40,834 bp.
This has been confirmed by partial denaturation maps and
electron microscopy, which also showed that the right end of
the DNA is attached to the phage tail and that the roughly
10-mer concatemers are sequentially packaged left to right. The
A118 genome contains 72 ORFs, organized in three major,
life-cycle-specific gene clusters. The genes required for lytic
development show an opposite orientation and arrangement
compared with the lysogeny control region. A function
could be assigned to 26 genes, while the remaining 46 have
no known counterparts.
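
The redundancy figure quoted above follows directly from the two lengths given in the text. The short Python sketch below only reproduces that arithmetic; the function name and constants are illustrative and are not taken from the cited work.

```python
def terminal_redundancy(packaged_bp: float, unit_bp: int) -> float:
    """Fraction of the unit genome that is repeated in one packaged molecule."""
    return (packaged_bp - unit_bp) / unit_bp


# Values quoted in the text for phage A118.
A118_PACKAGED_BP = 43_300  # average packaged DNA length (43.3 kb)
A118_UNIT_BP = 40_834      # unit genome size

extra_bp = A118_PACKAGED_BP - A118_UNIT_BP
redundancy = terminal_redundancy(A118_PACKAGED_BP, A118_UNIT_BP)
print(f"{extra_bp} bp repeated, {redundancy:.1%} of the unit genome")
```

Because 43.3 kb is itself an average, the result should be read as an approximate redundancy of about 6%, not as an exact length of the repeated region.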

The sequence of the prophage of L. monocytogenes
serovar 4b strain ScottA, PSA, was recently completed
(70). In contrast to A118, PSA features 10-nucleotide
3'-overhanging cohesive ends and packages exactly one
unit genome of 37,618 bp. Although its 55 open reading
frames are mostly unrelated to those of phage A118 (figure 37-2),
their overall life-style-specific gene organization is relatively
similar, except for the presence of some unique genes such as
a primase and a helicase in PSA, and a recombinase in A118,
each of which is required for the respective mechanism of
genome replication.

Proteome analysis of PSA revealed an unusual form
of translational frameshifting which yields different-length
forms of the major structural proteins of the capsid and tail,
respectively (70). The proteins feature identical N-termini
but different C-termini as a result of programmed +1
translational frameshifting. Frameshifting appears to be
initiated by a slippery nucleotide sequence with overlapping
proline codons near the 3' ends of both genes. This
apparently redirects the ribosomes into the +1 frames. Different
cis-acting factors (a shifty stop and a pseudoknot) are also
present. This phage PSA attribute is the first case of +1
frameshift among double-stranded DNA phages, and also
is the prototype of a virus featuring a 3' pseudoknot to
stimulate ribosomal frameshifts.
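
To make the consequence of a +1 shift concrete, the following minimal Python sketch reads a short RNA in triplets, once in the original frame and once after skipping a single base following a proline codon. The sequence and the slip position are hypothetical and chosen only for illustration; they are not the PSA sequence, and the sketch does not model the slippery site, the shifty stop, or the pseudoknot themselves.

```python
def codons(seq: str) -> list[str]:
    """Split a sequence into complete triplets, reading from its first base."""
    return [seq[i:i + 3] for i in range(0, len(seq) - 2, 3)]


def read_with_plus_one_shift(seq: str, slip_at: int) -> list[str]:
    """Read triplets in frame 0 up to slip_at, skip one base, then continue."""
    return codons(seq[:slip_at]) + codons(seq[slip_at + 1:])


# Hypothetical mRNA fragment (an assumption for illustration, not PSA data).
MRNA = "AUGGCACCCUAGGUCAAA"
SLIP_AT = 9  # index just after the CCC (proline) codon

in_frame = codons(MRNA)                            # hits the UAG stop after CCC
shifted = read_with_plus_one_shift(MRNA, SLIP_AT)  # bypasses that stop

print("frame 0 :", in_frame)
print("+1 shift:", shifted)
```

Both readings share the same first three codons, which corresponds to the identical N-termini described above, while everything downstream of the slip differs, which corresponds to the different C-termini.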

Attachment Sites and Integration 

So far, two different classes of temperate phages in
L. monocytogenes have been shown to integrate their DNA into
the host chromosome. Phage A118 is an example of the first
class (36). Sequence comparisons indicate that the A118
integrase enzyme is a serine recombinase related to Tn10
resolvase and Hin invertase. The A118 bacterial attachment
site, attB, lies within an open reading frame closely related
to comK of Bacillus subtilis. This gene encodes a transcriptional
activator for various factors involved in competence
for DNA uptake. Since L. monocytogenes is not easily
transformable, the role of its comK gene is not immediately
obvious. Integration of the A118 genome into comK
changes most of the coding sequence, and no phage
sequence reconstitutes this reading frame. The A118 phage
and bacterial attachment sites display only 3 base pairs
of homology near the point of crossover, as is common for
serine recombinases. In contrast, phage PSA integrates into
the tRNA Arg gene (29). PSA’s phage attachment site shares
identity with the 15 nucleotides at the 3' end of tRNA Arg plus
two downstream nucleotides. After integration by PSA, the
sequence of tRNA Arg is regenerated by prophage nucleotides.
PSA’s integrase is different from A118 Int, and is homologous
to Escherichia coli XerD, a tyrosine recombinase
that resolves dimeric circles of E. coli DNA and plasmids of
the ColE1 family (60).

Prophages and Lysogeny 

Lysogeny is widespread among strains of the genus Listeria;
the percentage of strains producing infective phage has
been estimated to range from 6% to 37%, depending on the
species (50). Prophages are readily inducible using UV light
or mitomycin C (34). Lysogenization can easily be provoked
by a high multiplicity of infection, and lysogens are
generally resistant to superinfection by the same or related
phages but not by phages of different immunity groups
(35). Many of the commonly used laboratory strains (e.g.,
10403S, EGDe, LO28, and others) carry an integrated
functional or cryptic prophage at the attB within comK.
However, comK and tRNA Arg are only two of several
existing attB sites in the Listeria genomes, since multiple
lysogens can be created by subsequent challenge
with different phages (35), and polylysogenic strains are
frequently observed (see chapter 7 for a discussion of phage
integration).

Figure 37-2 Alignment of the genetic maps from the temperate Listeria monocytogenes phages. Shown are phages PSA (70),
A118 (36), and the cryptic prophage φEGDe (20). Open reading frames are shown as black arrows, pointing in the direction
of transcription. The maps start with the terminase genes of the “late” region (left), and end with distal genes of the “early”
gene region, involved in DNA replication and recombination. Lysogeny control regions (mostly leftward-pointing arrows) are
located at approximate coordinates 20-30 kb on the ruler at the bottom of the figure. Genes encoding proteins of significant
amino acid sequence similarity are linked by gray shading. For PSA, the few genes whose products feature significant
similarity to A118 are indicated by numbers corresponding to the individual open reading frames (70).

Despite the fact that most if not all strains carry functional
or cryptic prophages, the potential influence of lysogeny
on the host phenotype is unknown. No phenotype
has been associated with comK inactivation by insertion of
A118-like phage or by insertion of an integration vector
(described below). No obvious association has yet been observed
between phage carrier state and Listeria phenotype,
especially with regard to pathogenicity. Nevertheless, temperate
phage may carry genes similar to host factors involved
in bacteria-host interaction (36). Resistance to phage was
sometimes found to be the result of changes in cell wall
composition (62, 68), which could also be linked to decreased
sensitivity to quaternary ammonium compounds (46).
Moreover, the presence of phage-encoded methyltransferases
such as M.LmoA118I (8, 36) affects distribution of
genetic material and may therefore influence the phenotype
of lysogens.

Cryptic Prophages 

Up to 71% (27) of all Listeria cultures produce substances
inhibitory to other Listeria strains but not to other
bacteria. These substances were termed “monocins” (7, 12,
22, 47, 61, 69). In monocin preparations, particles that
resemble phage tails or polyheads could be observed by
electron microscopy (7, 73). It was later shown by genetic
methods that these particles indeed result from incomplete,
cryptic prophages and that their lethal effect is due to
the presence of intact lysis genes (73). Similar to phage,
monocins display a killing-from-without effect which to
some degree is serovar correlated (72). A cryptic phage
related to A118 was identified in the chromosome of
L. monocytogenes EGDe (see figure 37-1), and five phage-like
elements are present in the chromosome of L. innocua
CLIP11262 (20).


Transducing Phages 

Many of the temperate Listeria phages are capable of generalized
transduction; that is, they more or less randomly
package host DNA and can therefore transduce functional
genetic markers into other cells (25). Transduction frequencies
range from 10 ~ up to 5 x 10~ , depending on the
phage and host used. The ability to package non-phage
DNA appears to correlate with the genome structure of the
viruses: the terminally redundant A118 does transduce,
whereas the cos-site phage PSA does not (25). This correlation
seems to hold for other Listeria phages as
well (unpublished data) and is likely dependent on
the different DNA packaging mechanisms employed by
these viruses, which were shown to have unrelated terminase
enzymes.
Virulent Listeria Phages

As mentioned above, most Listeria phages are temperate
and, with the exception of A511, nothing is known about
the few virulent phages for the genus. Phage A511, however,
is a particularly interesting virus. It has a very wide host
range and can infect approximately 95% of the relevant
L. monocytogenes strains found to be implicated in foodborne
disease. Of its large genome, only a fraction has yet
been analyzed in detail, namely the lysis gene region (44)
and a 10 kb fragment containing most of the morphopoietic
genes (40). A useful finding was the powerful promoter
sequences controlling expression of the major late genes,
cps and tsh, which enabled design of a reporter phage vehicle
(see below).

Relationships and Phage Evolution 

Comparative genomics demonstrates that phage A118 is
only very distantly related to phage PSA (figure 37-2), but
highly similar to the cryptic phage, φEGDe, found in the
chromosome of L. monocytogenes EGDe (20). Most differences
between A118 and φEGDe are found in the early gene region,
whereas the late gene region is, with the exception of
the major capsid genes, extremely conserved. Only a single
gene, encoding a part of the virus tail structure, is conserved
among these three Listeria-phage genomes. In addition,
these three phages were shown to contain portions resembling
functional regions of other phage genomes, in particular
those infecting lactic acid bacteria and other members of
the low G+C content subbranch of Gram-positive eubacteria
(36, 70).

Of interest is the relatedness of the Listeria phage A511
to Staphylococcus aureus phage Twort. Both belong to a
group of morphologically almost indistinguishable, obligately
virulent myoviruses that infect various Gram-positive
hosts (21). Phage A511 was reported to have significant
homologies in its late gene region to an intron-containing
sequence of Twort (28), raising the possibility of the presence
of self-splicing introns in Listeria phages. Also, in contrast
to the situation of the temperate phages, it seems more difficult
to explain the presence of almost identical sequence
elements in viruses that do not exist in a prophage state,
and do not infect a common host. We have previously
observed an unusually high rate of recombination in
A511 (39), which suggests that these (and other) viruses
may use some specially adapted mechanisms that augment
their ability to participate in the genetic mix-and-match
game.

Additional indications point to a relatedness
of Listeria phages to viruses of other closely related
Gram-positive bacteria, such as Brochothrix thermosphacta. Several
short regions of high DNA homology were identified
in morphologically unrelated phages of the two genera
(unpublished data). One such region contains an ssh gene
that is identical in the viruses of different origin but is
flanked by unrelated portions of the genomes, which may
be a good example of a limited, modular exchange.

Taken together, even the limited data on Listeria phages
which are available to date clearly support the “mosaics”
model of phage genome building, where phage genomes
are built from genetic modules. Functional segments are
accessible through different mechanisms from a large gene
pool (23, 24). Horizontal exchange in phages is obviously
dependent on the genetic material of their hosts, which
may restrict promiscuous exchange. Nevertheless, Listeria
phage genomic analyses add to the growing evidence that
individual phages likely have evolved from a limited number
of ancestral phages. This suggests a divergent evolution of
bacterial viruses which is strongly influenced by continuous
adaptation and genetic exchange of significant portions
of their genomes (see chapters 4 and 27 for further
discussion of these concepts).

Applications 

Typing Phage 

Various systems have been reported for phage typing of
Listeria (4, 6, 15, 16, 19, 31, 32, 45, 52, 53, 64). Typability
(sensitivity to at least one phage of a given set) is heavily
dependent on the bacteria: serovar 3 strains are mostly resistant,
in contrast to the high phage sensitivity of serovar 4 strains.
Virulent phage with a broad lytic range such as A511
increased overall typability from around 70% to more than
90% (32, 65). Currently this technique provides the simplest
and most widely used Listeria typing method (45), and
it provides a sensitive means for tracing the origin and
course of foodborne outbreaks of listeriosis.

Reporter Phage 

The potential of genetically engineered phage is widely
acknowledged in cloning procedures involving
well-characterized bacteriophages such as phage λ. Because of
its broad host range, phage A511 was selected as a candidate
for construction of a reporter bacteriophage. A genetic
fusion of Vibrio harveyi luxA and luxB genes was introduced
into the A511 genome, under control of the powerful cps
promoter (39). Following infection of Listeria cells by
A511::luxAB, viral gene expression results in bioluminescent
bacteria which can easily be detected and quantitatively
monitored even in a mixed bacterial flora. A thorough
evaluation of the system confirmed its usefulness as a
quick and sensitive method for detection of viable Listeria
in a variety of foods (40). See chapter 46 for further discussion
of this reporter phage approach to bacteria identification
and diagnosis.
Killer Phage

The potential usefulness of phage in bio-disinfection measures
against Listeria monocytogenes on solid surfaces and
production equipment was reported (56). However, this
application faces many potential problems, and much more
research is needed with respect to the potential of phage
for eradicating Listeria in immensely complex environments
such as food and feed. See chapter 48 for additional
consideration of the use of intact phages as antibacterial
agents.

Lytic Enzymes 

Listeria phage endolysins can be produced in E. coli, and
the recombinant enzymes retain high activity after affinity
purification (42, 43). The enzymes evolved to exhibit stringent
substrate specificity; that is, they only lyse the Listeria
cell peptidoglycan, with very few exceptions among closely
related Gram-positive bacteria. Although the virus uses
them from within the cell, Ply enzymes work equally well
when added exogenously. A tiny amount of enzyme is sufficient
to clear a dense suspension of Listeria cells within
seconds. The enzymes are active in a pH range from 6 to 10,
they are insensitive to common protease inhibitors and
chelating agents, and moderate concentrations of detergents
even increase their activity. Based on these properties,
they have found a number of applications, such as rapid
in vitro lysis (13, 41, 42, 44), removal of extracellular bacteria
in eukaryotic cell invasion assays (58), selective release of
intracellular metabolites such as ATP (59), and programmed
self-destruction of intracellular Listeria cells within the
cytosol of macrophages (14). A novel approach for biological
control of Listeria monocytogenes in fermented milk
products is the production and secretion of N-terminally
modified Ply511 by recombinant, lactose-utilizing
Lactococcus lactis (18).

Another very interesting possibility is the use of the
recombinantly produced cell wall binding domains of these
enzymes (see above). The high-affinity, specific CBD polypeptides
can be used for immobilization of host cells on
solid surfaces such as coated microplates or magnetic
beads. When fused to a fluorescent label, they have properties
similar to a labeled antibody and allow specific decoration
and efficient separation of Listeria cells from mixed
bacterial populations and even within infected bacterial
cells (30, 37).

Integration Vectors 

On the basis of cloned integrase genes, two chromosomal
integration vectors have been constructed. The first uses
the integrase and phage attachment site of the A118-related
phage, U153, and the second uses the analogous elements
of phage PSA (29). These vectors are propagated in E. coli
and can be transferred to L. monocytogenes by conjugation
or by electroporation. Since these plasmids cannot replicate
in L. monocytogenes, retention of the drug-resistant
phenotype requires chromosomal integration. Using the
U153 int-based vector to integrate the genes for listeriolysin
O (LLO) and ActA, Lauer and coworkers showed that these
genes are expressed well from the comK attachment site
(29). Using the PSA vector, it has been shown that the secA2
gene can be expressed well from the tRNA Arg attachment
site (30).

Transducing Phage 

A recent and important finding was the use of Listeria
phage for generalized transduction of genetic material from
one strain to others (25). Of particular use is the introduction
of marker-tagged mutations and associated phenotypes
into a clean genetic background, which enables detailed
genetic mapping and characterization of the mutation.
The most useful phages for this purpose are P35 and U153
(serovar 1/2 strains), and A500 (serovar 4 strains). Phages
infecting L. innocua and L. ivanovii were not tested in this
study, but it is conceivable that many of the temperate
viruses infecting these species will also be generalized
transducers.
Mycobacteriophages

GRAHAM F. HATFULL 


Mycobacteriophages are viruses of the Mycobacteria.
The interest in these phages derives in large part
from the medical significance and biological idiosyncrasies
of their hosts. Mycobacteria are acid-fast staining bacteria
with characteristic waxy cell walls that can be readily
divided into two groups based on their growth rate:
slow-growers such as Mycobacterium tuberculosis that have a
doubling time of 24 hours and fast-growers such as
Mycobacterium smegmatis with 3-4 hour doubling times
(for reviews see 8, 32). Several mycobacterial species are
important human and animal pathogens, with the most
notorious being M. tuberculosis and Mycobacterium leprae,
the causative agents of tuberculosis and leprosy, respectively
(8). The extent of these diseases is alarming: M. tuberculosis
is the leading cause of human mortality from a single
infectious disease, and the increased prevalence of multiple
drug-resistant M. tuberculosis strains greatly complicates its
treatment. A study of the mycobacteriophages offers potential
for the development of novel methodologies for the
diagnosis, prevention, and treatment of these diseases as
well as revealing interesting biological features of their
unusual bacterial hosts (30, 32, 33).

Our knowledge of the composition and diversity of
the huge global population of bacteriophages, estimated
at 10^31 particles (11, 83), is meager and we are just beginning
to learn how the mycobacteriophages are related to
other phages in the population. While there is no reason
to suppose that they are atypical, mycobacteriophages
presumably have strategies for infection through complex
mycobacterial cell walls, they must be adapted for growth
in slow-growing hosts, and they could conceivably be
involved in mycobacterial pathogenicity. These features,
together with numerous additional aspects of virus-host
intimacy (DNA integration, gene expression, adsorption,
lysis, etc.), suggest that mycobacteriophages offer a
fruitful field for exploration.

Mycobacteriophages were first investigated in the 1950s,
prompted by their utility in phage-typing of clinical
mycobacterial isolates (68, 76). Over 200 different
mycobacteriophages infecting a broad variety of mycobacterial hosts
have been described, some with quite narrow host ranges
(such as DS6A that infects only members of the M. tuberculosis
complex) and others that infect a wide variety of
slow- and fast-growing strains (33). Until the late 1980s,
rather few had been studied in molecular detail, although
significant progress has been made since and there are
several helpful reviews (29-31, 33, 51). In this chapter, I
will focus on recent studies, most notably in the burgeoning
area of bacteriophage genomics.

Mycobacteriophage Genomics 

The development of high-throughput DNA sequencing
technologies has made it possible to determine readily
the complete genome sequences of bacteria and bacteriophages.
The first mycobacteriophage genome to be sequenced
was that of L5 in 1993 (34) followed by D29 and
TM4 in 1998 (22, 23) and Bxb1 in 2000 (53); more recently,
the genomes of an additional 10 mycobacteriophages (Barnyard,
Bxz1, Bxz2, Che8, Che9c, Che9d, Corndog, Cjw1,
Omega, and Rosebush) have been determined (57). Maps
of the phage genomes are available at the web site
www.thebacteriophages.org. All these phages infect
M. smegmatis although many also infect other mycobacterial
species including slow-growers such as BCG and M. tuberculosis;
partial sequence information is available for
mycobacteriophages DS6A and Ms6 (24, 25). Genome sequences of
several mycobacterial hosts have also been determined
including two strains of M. tuberculosis (H37Rv and
CDC1551) and M. leprae (13, 14, 21); sequencing of the
M. smegmatis, Mycobacterium avium, Mycobacterium bovis,
Mycobacterium marinum, and BCG genomes is in progress
with completion expected soon.

Overview and Classification of Sequenced Mycobacteriophages

Comparative analysis of a group of genomes is awkward
without some general sense of their overall relatedness.
A hierarchical organization would be helpful but is problematic
with double-stranded DNA phages in general and
with the mycobacteriophages specifically. These difficulties
are illustrated by the following general features of the
mycobacteriophage genomes. First, at the nucleotide level
they are quite diverse (figure 38-1A). Some
phages have significant levels of nucleotide similarity, with
phages L5 and D29 being the most closely related pair;
phage Bxz2, and to a lesser extent phage Bxb1, also have
nucleotide similarity to these phages. Likewise, phages
Che9d and Che8 share some modest segments of nucleotide
sequence similarity. In contrast, other phages (such as
Barnyard, Rosebush, Che9c, Omega, Bxz1, Cjw1, and Corndog)
have no extensive nucleotide sequence similarity to
the other phages (figure 38-1A).

Figure 38-1 Mycobacteriophage genomic features. A: DNA sequence similarity among 14 completely sequenced
mycobacteriophage genomes. B: Relationship between %GC content and genome length for all double-stranded DNA phage
genomes smaller than 100,000 base pairs. Phages are as follows: mycobacteriophages (■), E. coli and Salmonella phages (▲),
dairy phages (♦), and others (•).

Secondly, there is a substantial variation in genome 
length, although they are generally larger than other 
double-stranded DNA tailed phage genomes (table 38-1). 
The reason for this is not clear but a correlation between 
the GC% content and phage genome length has been noted 
(figure 38-1B) (57). The relatively large genome size can 
thus be considered in the context of their high GC%, rather 
than reflecting the necessity for additional functions 
needed to propagate within mycobacterial hosts. 
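
As a quick consistency check on the genome sizes and G+C values discussed here, the sketch below recomputes the "Total" and "Average" entries of table 38-1 from the per-phage values. The numbers are transcribed from the table; the variable names are illustrative. It does not attempt to reproduce the cross-phage correlation of figure 38-1B, which draws on genomes not listed in this chapter.

```python
from statistics import mean

# (genome size in bp, G+C%) for each phage, as listed in table 38-1.
TABLE_38_1 = {
    "L5": (52_297, 62.3), "D29": (49_136, 63.5), "TM4": (52_797, 68.1),
    "Bxb1": (50_550, 63.7), "Bxz1": (156_102, 64.8), "Che8": (59_471, 61.3),
    "Bxz2": (50_913, 64.2), "Cjw1": (75_931, 63.1), "Corndog": (69_777, 65.4),
    "Che9c": (57_050, 65.4), "Omega": (110_857, 61.4), "Che9d": (56_276, 60.9),
    "Barnyard": (70_797, 57.3), "Rosebush": (67_480, 69.0),
}

sizes = [size for size, _ in TABLE_38_1.values()]
gc_values = [gc for _, gc in TABLE_38_1.values()]

total_bp = sum(sizes)
mean_bp = mean(sizes)
mean_gc = mean(gc_values)
smallest = min(TABLE_38_1, key=lambda name: TABLE_38_1[name][0])
largest = max(TABLE_38_1, key=lambda name: TABLE_38_1[name][0])

print(f"total {total_bp} bp, mean {mean_bp:.0f} bp, mean G+C {mean_gc:.1f}%")
print(f"smallest: {smallest}, largest: {largest}")
```

The recomputed total and averages agree with the summary rows of the table, and the size range (from D29 to Bxz1) spans roughly a threefold difference.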

Thirdly, there is no simple distinction between temperate
and lytic viruses and examination of plaque morphologies
does not provide a clear distinction between these
classes. Some phages, such as L5, Bxb1, and Bxz2, form
obviously turbid plaques from which stable lysogens can be
isolated: these are indisputably temperate. Other phages,
such as D29, form clear plaques from which lysogens
cannot be recovered (22). While D29 is thus a lytic phage,
this tells us little about its genome structure, and the
sequence of D29 reveals that it is simply a recent derivative
of an L5-like temperate parent (22). Most of the other
phages form plaques that are lightly turbid or hazy in
appearance, and we favor the interpretation that while
perhaps competent to form lysogens (or pseudolysogens),
either the establishment or maintenance of lysogeny is
poor under laboratory conditions with the bacterial hosts
employed. For example, preliminary observations suggest
that phages Che8 and Cjw1 can both form M. smegmatis
lysogens under certain conditions.

Fourthly, there is no obvious distinction between
phages that use cos- and pac-type packaging systems as
described for the dairy phages (10). Eleven of the 14
mycobacteriophages have genomes with defined ends bearing
short single-stranded DNA cohesive termini (table 38-1).
However, we have not been able to identify defined genomic
ends in Bxz1, Rosebush or Barnyard and we presume that
they have circularly permuted genomes. Phage Bxz1 is
curious in that it contains a region with a large stretch
(>60) of G residues through which sequencing reactions
fail, although this does not represent genome ends. Bxz1
has been shown to be a generalized transducing phage,
consistent with a pac-type packaging system; it is not
known whether phages Rosebush and Barnyard are also
generalized transducers.

Fifth, there is considerable variation in the viral morphologies.
Bxz1 is the only phage in this group with
a contractile tail; all the others have long flexible tails
of varying lengths (table 38-1). Twelve of the phages have
isometric heads, all of which are approximately 60 nm in
diameter except for Bxz1, which is about 80 nm in diameter.
Two of the phages, Che9c and Corndog, have unusual
prolate heads (see chapter 2 for a discussion of phage
classification and morphology).

Table 38-1 General Features of 14 Mycobacteriophages and their Genomes

Phage     Genome     G+C%  tRNAs  ORFs   Av. ORF    Novel genes  Genome ends         Head size  Tail length
          size (bp)        (no.)  (no.)  size (bp)  no. (%)      (no. of bases,      (nm)       (nm)
                                                                 polarity)

L5        52,297     62.3  3      87     601        14 (16.1)    9 base, 3'          ~60        140
D29       49,136     63.5  5      77     638        10 (13.0)    9 base, 3'          ~60        140
TM4       52,797     68.1  0      92     574        53 (57.6)    10 base, 3'         ~60        200
Bxb1      50,550     63.7  0      86     588        29 (33.7)    9 base, 3'          ~60        130
Bxz1      156,102    64.8  28     225    694        164 (72.9)   Circular permuted   ~80        78
Che8      59,471     61.3  0      112    531        20 (17.9)    10 base, 3'         ~60        186
Bxz2      50,913     64.2  3      86     599        22 (25.6)    10 base, 3'         ~60        140
Cjw1      75,931     63.1  1      141    546        77 (54.6)    9 base, 3'          ~60        250
Corndog   69,777     65.4  0      122    572        63 (51.6)    9 base, 3'          23 x 138   265
Che9c     57,050     65.4  0      84     671        39 (45.3)    10 base, 3'         35 x 80    165
Omega     110,857    61.4  2      237    466        134 (56.5)   4 base, 3'          ~60        205
Che9d     56,276     60.9  0      111    507        38 (34.2)    10 base, 3'         ~60        130
Barnyard  70,797     57.3  0      109    650        93 (85.3)    Circular permuted   ~60        265
Rosebush  67,480     69.0  0      90     750        65 (72.2)    Circular permuted   ~60        245

Total     979,434          42     1559              821 (52.7)
Average   69,960     63.6  3      118.6  599        58.6

Lastly, no hierarchical relationships among these
phages can be reconstructed using protein sequence similarities
since the genomes are highly mosaic (see below for
further discussion). Thus, although Bxz1 may superficially
appear to be different from the other phages in that it is
substantially larger (156 kbp) and apparently differs in its
genome organization, it cannot be easily taxonomically
separated from the others. For example, it contains more
genes in common with phage Corndog than are shared
between phages Che9c and L5 (57). Thus, neither morphology,
genome length, packaging style, life-style nor sequence
similarity offers a simple classification scheme for these
phages.

Architectural Features of Mycobacteriophage Genomes

There is no single architectural design unifying all 14
mycobacteriophage genomes although common features
are apparent. For example, all the genomes are replete with
protein-coding genes with few noncoding spaces. Large
groups of genes are usually transcribed together in the
same direction and are presumably cotranscribed from a
relatively small number of phage promoters. However, there
is no common transcriptional organization shared by all
the phages (figure 38-2). With the exception of phage Bxz1, a
cluster of genes (encompassing approx. 20 kbp) involved
in virion structure and assembly can be recognized in
the left part of each genome; this cluster
is usually further divided into the DNA packaging functions
(terminase), head genes, and tail genes (figure 38-2).
In four of the phages (TM4, Che8, Che9c, and Che9d), the
terminase gene is located within 1 kbp of the genome
termini (the prototypical arrangement seen in phage λ),
but in general these parts of the genomes are quite variable
and additional genes are often present. For example,
four of the genomes (L5, D29, Bxb1, Bxz2) have lysis genes
in this region rather than downstream of the structural
gene cluster (as in Corndog, Che8, Che9c, Che9d, TM4,
Cjw1, Barnyard, Omega, and Rosebush); in phage Corndog
the terminase gene and genome left end are separated
by over 13 kbp and there are about 30 genes in this interval
(figure 38-2).

Nine of the phages (L5, D29, Bxb1, Bxz2, Che8, Che9c,
Che9d, Cjw1, Omega) encode an integrase, and in each case
the gene is situated close to the center of the genome; where
known, the attP site is adjacent to int. The distance between
the left end of the integrase gene and the left end of
the genome, expressed as a fraction of genome length, varies
only from 46.3% (phage L5) to 58.4% (phage
Bxb1) even though there is more than a 2-fold difference
in total genome length among the integrase-encoding
genomes (table 38-1). The reason why the attP/integrase
cassette should be centrally located, giving rise to similarly
sized left and right arms, is not clear. We note,
however, that the region between the int and the structural
genes (or the lysis genes if they are in this location) is variable
in size and composition (figure 38-2). In L5 they are
almost adjacent and in Omega this interval is >9 kbp
and contains approximately 30 genes, mostly of unknown
function.
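
The percentages above can be turned back into approximate arm lengths using the genome sizes in table 38-1. The sketch below is only a back-of-the-envelope calculation: the quoted percentages are rounded, so the resulting coordinates are estimates and not annotated gene positions, and the function name is illustrative.

```python
def arm_lengths(int_fraction: float, genome_bp: int) -> tuple[int, int]:
    """Approximate left- and right-arm lengths (bp) for an integrase gene
    whose left end lies at int_fraction of the genome length."""
    left = round(int_fraction * genome_bp)
    return left, genome_bp - left


# Fractions quoted in the text; genome sizes from table 38-1.
l5_left, l5_right = arm_lengths(0.463, 52_297)
bxb1_left, bxb1_right = arm_lengths(0.584, 50_550)

print(f"L5:   left arm ~{l5_left} bp, right arm ~{l5_right} bp")
print(f"Bxb1: left arm ~{bxb1_left} bp, right arm ~{bxb1_right} bp")
```

For both phages the two arms differ by only a few kilobase pairs, which is the sense in which the attP/integrase cassette is described as centrally located.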

The organization and numbers of genes in the right
parts of the genomes (to the right of the structural, lysis,
and integration genes) vary greatly. In phage D29 this
segment is approximately 22 kbp and includes about 60
genes; in contrast, phage Omega has more than 130 genes
in a span of over 55 kbp, more than the entire D29 genome!
Typically these regions contain a preponderance of small
open reading frames, most of which do not match previously
described genes although genes involved in DNA metabolism,
replication, recombination, and regulation can be
recognized (see below).

Phage-Encoded tRNA Genes 

Six of the sequenced mycobacteriophages encode tRNA
genes (table 38-1). In phages L5, D29, and Bxz2 there is a
small cluster of tRNA genes located approximately 4 kbp
from the left genome end, upstream of the terminase genes.
Phages Omega and Cjw1 each contain one standard tRNA
gene close to the right end of the genome, plus one
nonstandard tRNA gene; the nonstandard Omega tRNA has eight
bases in the anticodon loop, suggesting it may be a
frameshift suppressor. Bxz1 contains the largest number of
tRNA genes (28), arranged in two large, loosely organized
clusters; 27 of these are typical and correspond to 16
different amino acid specificities, the largest set of
tRNA genes identified in any viral genome. The one atypical
tRNA has a 5'-CUA anticodon and could function as a
suppressor of 5'-UAG stop codons, raising the question of
whether UAG is used as a stop codon. Thirty-five of the
231 open reading frames terminate with UAG and, although
many of these are followed by short noncoding
segments, alternative stop codons, or adjacent downstream
coding regions in the same frame, this is not true for all
of these genes. Thus, while the putative suppressor tRNA
could regulate the expression of specific genes, it may not
confer a wholesale modification of the Bxz1 genetic code.
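
The pairing logic behind the suppressor proposal can be made concrete with a short sketch. It is illustrative only: the function name is hypothetical, only strict Watson-Crick pairing is modeled (wobble pairing is ignored), and the counts 35 and 231 are those quoted above for Bxz1.

```python
# Strict Watson-Crick pairing for RNA bases (wobble is deliberately ignored).
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def codon_for_anticodon(anticodon_5to3: str) -> str:
    """Return the codon (written 5'->3') that pairs with an anticodon (5'->3').

    Codon and anticodon pair antiparallel, so the codon is the reverse
    complement of the anticodon.
    """
    return "".join(COMPLEMENT[base] for base in reversed(anticodon_5to3.upper()))


# The atypical Bxz1 tRNA carries a 5'-CUA anticodon.
print(codon_for_anticodon("CUA"))

# Share of Bxz1 open reading frames that end in UAG (35 of 231, from the text).
uag_fraction = 35 / 231
print(f"{uag_fraction:.1%}")
```

The reverse complement of CUA is UAG, which is why this tRNA is a candidate amber suppressor; roughly 15% of the open reading frames would be affected if suppression were efficient at every UAG.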

The role of the mycobacteriophage-encoded tRNAs is
not known, although Kunisawa noted that phage D29
codons corresponding to the D29 tRNAs have higher relative
usage than they do in the host (43). He concludes that
the host (M. tuberculosis) has a paucity of the cognate
tRNA for these codons and that the phage compensates by
carrying additional copies of the appropriate tRNA genes.
However, no similar correlation is seen in Bxz1, where
fewer than half of the 27 Bxz1 tRNAs correspond to
codons that are overabundant in the phage (relative to M.
tuberculosis). Either this explanation is not generally
applicable or the codon usage of the preferred host for Bxz1
is substantially different from that of M. tuberculosis. It is
noteworthy that Bxz1 does not efficiently infect slow-growing
mycobacterial strains and the codon frequencies in M.
smegmatis have yet to be evaluated; phage D29 does, however,
infect both fast- and slow-growing strains.

Figure 38-2 Mycobacteriophage genomic architectures. Each of the 14 completely sequenced
mycobacteriophage genomes is represented as a thick horizontal line with arrows indicating the direction of transcription.
The locations of genes involved in integration, lysis, the head and tail genes, and packaging (terminase) are indicated.


Virion Structure and Assembly 

Capsid Proteins 

N-terminal sequence analysis of phage L5 virion proteins
shows that the capsid subunit is encoded by gene 17 and
covalently crosslinked in the mature capsid, similarly to
the coliphage HK97 capsid (34, 65); D29 gp17, Bxb1 gp14,
Bxz2 gp17, Che9d gp7, and TM4 gp9 all share significant
sequence similarity with L5 gp17 (table 38-2). While
none of these phages are close relatives of the phage HK97,
PSI-BLAST analysis reveals them all to be members of the
HK97 gp5 capsid-protein superfamily. Phage Cjw1 gp12
and phage Omega gp15 are more closely related to the
phage HK97 gp5, and phage Cjw1 gp12 shares 23% identity
with HK97 gp5. Interestingly, the prolate-headed phage
Che9c encodes a protein (gp6) with strong sequence similarity
to proteins Rv1576c and Rv2650c of the M. tuberculosis
prophage-like elements φRv1 and φRv2, respectively, and
all three can be drawn into the HK97 gp5 family by
PSI-BLAST analysis. Particles corresponding to the φRv1 and
φRv2 elements have yet to be identified but these
observations raise the question as to whether these might also
form prolate structures similar to phage Che9c heads.

Table 38-2 Mycobacteriophage Capsid Genes

| Phage | Capsid | Size (amino acids) | Capsid homologs | Other features |
|---|---|---|---|---|
| L5 | gp17 | 327 | D29 gp17, Bxb1 gp14, Bxz2 gp17, A118 gp6, Che9d gp7, TM4 gp9 | Member of the HK97 gp5 family of capsid proteins |
| D29 | gp17 | 319 | L5 gp17, Bxb1 gp14, Bxz2 gp17, A118 gp6, Che9d gp7, TM4 gp9 | Member of the HK97 gp5 family of capsid proteins |
| Bxb1 | gp14 | 398 | L5 gp17, D29 gp17, Bxz2 gp17, A118 gp6, Che9d gp7, TM4 gp9 | Member of the HK97 gp5 family of capsid proteins. 180 amino acid C-terminal extension at ends of Bxb1 gp19, Rosebush gp15, and gp21. In middle of Bxb1 gp23, Bxz1 gp124, Omega gp35, and Bxz2 gp27 |
| Bxz2 | gp17 | 331 | Bxb1 gp14, L5 gp17, D29 gp17, A118 gp6, TM4 gp9, Che9d gp7 | Member of the HK97 gp5 family of capsid proteins |
| TM4 | gp9 | 306 | A118 gp6, Bxz2 gp17, D29 gp17, L5 gp17, Bxb1 gp14 | Member of the HK97 gp5 family of capsid proteins |
| Che9d | gp7 | 312 | r1t gp31, Bxz2 gp17, D29 gp17, L5 gp17, Bxb1 gp14 | Member of the HK97 gp5 family of capsid proteins |
| Cjw1 | gp12 | 498 | BFK20 gp6, HK97 gp5, HK022 gp5, D3 gp6 | Member of the HK97 gp5 family of capsid proteins |
| Omega | gp15 | 478 | HK97 gp5, HK022 gp5, bIL285 gp44, (Bxb1 gp13?) | Member of the HK97 gp5 family of capsid proteins |
| Che9c | gp6 | 544 | Rv1576c, Rv2650c, AgrC capsid, Caulobacter hyp., M. loti gp36, H. flu hyp., SfV capsid, VWB gp2 | Member of the HK97 gp5 family of capsid proteins |
| Che8 | gp6? | 274 | DRA0099 | PSI-BLAST connection to T3/T7 capsid subunits |
| Corndog | gp41? | 290 | HI0445/HI0446 | HI0445/HI0446 is a homolog of φC31 gp36 |
| Rosebush | gp15? | 676 | | gp15 has C-terminal extension of Bxb1 gp14 and gp19, and Rosebush gp21. No homologs. |
| Bxz1 | ?? | | | Capsid gene identity uncertain; no homologs |
| Barnyard | ?? | | | Capsid gene identity uncertain; no homologs |

The major capsid subunit gene is harder to identify in
phages Che8, Corndog, Rosebush, Bxz1, and Barnyard. In
phage Che8, a likely candidate is gene 6, whose product
is related (26% identity) to the DRA0099 protein that is
encoded by a prophage element in Deinococcus
radiodurans; PSI-BLAST analysis links DRA0099 and Che8 gp6
to capsid proteins of phages T3, T7, and cyanophage P60
(table 38-2). In Corndog, gp41 is a likely capsid subunit
since it is related to a gene product in a genetic island in
Haemophilus influenzae that in turn is related to phage φC31
gp36, a known member of the HK97 gp5 superfamily. The
capsid subunits of phages Bxz1 and Barnyard have yet to be
identified, although the Barnyard gene is most likely one
of several (18-28) upstream of the tape measure gene. In
phage Rosebush, gene 15 is a candidate capsid subunit
gene for reasons that are described below.


The Unusual Capsid Subunit of Bxb1

Phage Bxb1 shares considerable overall similarity with
phages L5 and D29, especially within the virion structure
and assembly gene cluster. The major capsid subunits of
Bxb1 (gp14), L5 (gp17), and D29 (gp17) are closely related;
L5 gp17 and D29 gp17 share 82% identity, Bxb1 gp14 and
L5 gp17 share 69% identity, and Bxb1 gp14 and D29 gp17 share
72% identity (figure 38-3). However, while L5 gp17 and
D29 gp17 are similar in length (326 and 318 amino acids,
respectively), Bxb1 gp14 is larger (397 amino acids) due to
a C-terminal extension of approximately 85 residues. Since
the covalently crosslinked pentameric and hexameric
forms of the Bxb1 capsid migrate more slowly during
SDS-PAGE than those of L5 (and D29), it seems unlikely that this
C-terminal appendage is cleaved off during assembly (53).
Moreover, the role of the C-terminal extension is not
obvious since clearly it is not required for L5 or D29
assembly. We note that the Bxz2 capsid subunit (gp17)
is also closely related (56% identity with Bxb1 gp14) but
lacks the C-terminal extension.

Figure 38-3 The unusual structures of the Bxb1 capsid and major tail proteins. A: The capsid (major head subunit) of Bxb1
is similar to L5 and D29 capsids but contains a C-terminal extension that is related to a similar extension at the C-terminus
of the Bxb1 major tail subunit. A similar element is also present at the C-termini of Rosebush gp15 and gp21; these may
correspond to the major head and tail subunits of Rosebush, but the large N-terminal regions do not have database matches.
Related segments are also found in a putative Bxz1 structural protein (gp124) and in the middle of both Bxb1 gp23 and
Omega gp35 (not shown). The degree of amino acid sequence identity is shown for segments of vertically adjacent pairs.
B: Amino acid sequence alignment of C-terminal extensions of Bxb1 proteins. Sequences were aligned using ClustalX.
Residues present in all sequences are indicated by an asterisk above the alignment. Amino acid segments shown are as
follows: Bxz2 gp27, 267-342; Bxb1 gp23, 393-448; Bxz1 gp124, 55-119; Omega gp35, 250-302; Bxb1 gp14, 303-397; Bxb1
gp19, 190-283; Rosebush gp15, 580-675; Rosebush gp21, 260-359. The function of this sequence motif is not known.
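
The percent identities quoted for the capsid subunits are derived from pairwise alignments. The sketch below shows one common way such a figure can be computed once two sequences have been aligned. It is not the authors' procedure: the function name is hypothetical, the two aligned strings are invented toy data rather than real gp14 or gp17 sequence, and the choice to score only columns without gaps is an assumption (other conventions divide by the full alignment length or by the shorter sequence).

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns in which neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    identical = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == "-" or res_b == "-":
            continue  # skip gap columns (assumed convention)
        compared += 1
        if res_a == res_b:
            identical += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * identical / compared


# Invented toy alignment; '-' marks a gap.
toy_a = "MKT-LLA"
toy_b = "MRTALLA"
print(round(percent_identity(toy_a, toy_b), 1))
```

Because the denominator convention changes the result, identities reported for proteins of different lengths (such as Bxb1 gp14 with its C-terminal extension) are best compared only when computed the same way.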


The extension at the C-terminus of Bxb1 gp14 is curious
since a similar sequence is also present at the C-terminus
of the Bxb1 major tail subunit, gp19 (figure 38-3). Bxb1 gp19
is a close relative of the major tail subunits of phages Bxz2
(gp23), D29 (gp23), and L5 (gp23), sharing 64%, 61%, and
60% identity, respectively (figure 38-3). However, as with
the capsid subunit, Bxb1 gp19 is longer than its
counterparts in L5, Bxz2, and D29 due to an extension of about 85
residues at the C-terminus. Interestingly, this 85-residue
segment is related to that at the end of Bxb1 gp14, with the
two sharing 47% identity (figure 38-3). As with the capsid
subunit, the Bxb1 major tail subunit is not processed (53).
Thus, each Bxb1 particle is expected to contain about 600
copies of this motif, 415 in the capsid and about 200 in
the tail!
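
The capsid figure of 415 is what would be expected if the Bxb1 head, like that of HK97, is a T = 7 icosahedral shell in which one pentamer is replaced by the portal. The text does not state the triangulation number, so T = 7 is an assumption in the sketch below, as is the hypothetical function name; the tail figure of about 200 is taken directly from the text.

```python
def capsid_subunits(t_number: int, portal_replaces_pentamer: bool = True) -> int:
    """Major capsid protein copies in an icosahedral shell of triangulation number T.

    A complete shell has 60 * T subunits; in tailed phages one of the twelve
    pentamers is typically replaced by the portal, removing 5 subunits.
    """
    total = 60 * t_number
    return total - 5 if portal_replaces_pentamer else total


capsid_copies = capsid_subunits(7)   # T = 7 is assumed, not stated in the text
tail_copies = 200                    # "about 200" major tail subunits, from the text
print(capsid_copies, capsid_copies + tail_copies)
```

Under these assumptions the capsid carries 415 copies and the whole particle a little over 600, in line with the estimate given above.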

This motif is quite promiscuous and is found in several
other locations. For example, there is at least one more
distantly related copy of this motif in the middle of putative
minor tail proteins, Bxb1 gp23 and Bxz2 gp27; Bxz1 also
contains a related sequence near the N-terminus of gp124, a
putative structural protein. However, an especially striking
relationship is in mycobacteriophage Rosebush, in which
two open reading frames (15 and 21) each contain a copy of
this motif at their C-termini (figure 38-3). The functions of
these two proteins are not known but we speculate that
they correspond to the capsid and major tail subunits,
respectively.

Scaffold Proteins 

Although the capsid proteins of coliphage HK97 and
mycobacteriophage L5 are members of the same family of
proteins, they differ in that HK97 gp5 contains a 105-residue
N-terminal domain that plays a role in capsid assembly but
is proteolytically removed during capsid maturation.
In contrast, L5 encodes a separate scaffold protein, gp16,
which is a component of head-like particles in infected
cells but absent from mature heads. Phages D29, Bxb1, and
Bxz2, whose structural proteins are close relatives
of those of L5, also contain similar scaffold protein genes
located immediately upstream of the capsid subunit gene. In
phage TM4, the capsid subunit (gp9) is a similar size to L5
gp17 (305 versus 326 amino acids) and likely also has a
separately encoded scaffold protein. While the upstream gene
product (gp8) is not obviously related to the L5/D29/Bxb1/Bxz2
scaffold proteins, PSI-BLAST analysis suggests that these
are all members of a larger family of assembly proteins;
mycobacteriophage Che8 gp5 and Che9d gp6 as well as the
lactococcal phage r1t gp30 also appear to be members
of this group. In contrast, the putative capsid proteins
of Che8 (gp6), Cjw1 (gp12), Corndog (gp41), Omega (gp15),
and Che9c (gp6) are generally longer than L5 gp17 and
do not have a scaffold-like gene immediately upstream;
it seems probable, therefore, that these putative capsid
proteins are structurally more similar to HK97 gp5 and
utilize an N-terminal domain to promote capsid assembly.

Portals and Proteases 

Genes encoding protease proteins involved in capsid
assembly can also be identified in some of the
mycobacteriophages, typically immediately upstream of the
capsid assembly genes. For example, Che9c gp5, Omega
gp14, and Corndog gp39 are weakly related to other phage
proteases (e.g., φC31 gp35), as are proteins Rv1577c and
Rv2651c encoded by the M. tuberculosis prophage-like
elements φRv1 and φRv2. Other protease candidates are L5
gp15, D29 gp15, Bxz2 gp15, Bxb1 gp12, Che8 gp4, Cjw1 gp11,
Che9d gp5, and TM4 gp6, all of which are encoded between
the capsid assembly and upstream putative portal genes.
Putative portal proteins such as Omega gp13, Cjw1 gp10,
Che9c gp4, and Corndog gp34 share sequence similarity
with a large group of other phage portals (including HK97
gp3). L5 gp14, D29 gp14, Bxz2 gp14, Bxb1 gp11, Che9d
gp4, TM4 gp5, and Che8 gp3 form a second group of portal
proteins that are related to φ31 gp5 and r1t gp27.

Tails 

As noted above, all the sequenced mycobacteriophages
except Bxz1 have long, noncontractile tails. Tail length
varies considerably, from approximately 140 nm (L5) to
about 260 nm (Rosebush), but in most cases the tails have
a defined structure at the tip and a tail shaft composed
of a number of rings. Side tail fibers have not been observed
on any of these phages, although some (e.g., L5 and D29)
have a visible spike protruding from the tail tip.

The visible shaft of these noncontractile tails is likely
composed of rings of the major tail subunit. The genes
encoding these proteins have been identified in several of
the phages and they appear to fall into two main sequence
groups: one group contains L5 gp23, D29 gp23, Bxb1
gp19, Bxz2 gp23, TM4 gp14, and Che9d gp14; the other
contains Che9c gp12, Corndog gp49, Che8 gp11, Cjw1
gp18, and Omega gp31. The first group is related to virion
subunits (presumably also major tail subunits) of some
non-mycobacteriophage viruses such as lactococcal phage
r1t (gp37) (79) and staphylococcal phage VWB (gp27) (1).
One notable feature of these proteins is that they typically
migrate somewhat slower than their predicted molecular
weight in SDS polyacrylamide gel electrophoresis, an
observation that has also been noted for the major tail subunit
of phage λ.

N-terminal sequencing has identified several genes
encoding minor tail subunits in phage L5, including the
products of genes 6, 26, 27, and 28 (34). Identifying gene
6 as a tail protein gene was somewhat surprising since it is
encoded upstream of terminase and other structural genes;
D29 gp6 and Bxz2 gp3 are homologs of L5 gp6 and are encoded
in a similar place in the genome. The product of L5 gene 26
is almost certainly the phage tape measure protein, and
gene 26 is the largest gene in the L5 genome. L5 gp26 may be
C-terminally processed since, while the N-terminal sequence
corresponds to the predicted start of the gene, the protein
migrates significantly faster than its predicted molecular
weight of 86 kDa. The two additional minor tail proteins,
gp27 and gp28, may be tail tip components.


Programmed Translational Frameshifting
in Tail Genes

It has been shown previously that phage λ encodes two
gene products from the G and T genes (47). One is the
expected gpG protein; the other (gpG-T, which is made at
about 4% of the amount of gpG) is expressed via a programmed
-1 translational frameshift near the end of the G gene.
Curiously, this frameshifting phenomenon within genes
encoded just upstream of the tape measure protein is an
extremely common feature among the double-stranded
DNA tailed phages (84), although in some phages (e.g., Mu)
it is a -2 frameshift (54). Genes expressed by similar
frameshifting events can be identified in many of the
mycobacteriophages, and in mycobacteriophage L5 the expression
of gp24-25 via a -1 frameshift within the end of gene 24
has been experimentally demonstrated (84). It is predicted
that phages D29, Bxb1, TM4, Bxz2, Bxz1, and Cjw1 also
express tail genes via a -1 frameshift (table 38-3) and, with
the exception of Bxz1, the encoded proteins are related,
albeit weakly in some cases. Interestingly, in phages Che8,
Corndog, Che9c, and Omega, genes in the same location are
also predicted to be expressed via frameshifting, but in
these cases a -2 frameshift is involved (table 38-3).
Moreover, while Cjw1 genes 20 and 21 utilize a -1 frameshift,
they are more closely related at the sequence level to
the Che8/Corndog/Che9c/Omega group, suggesting that a
switch between -1 and -2 frameshifting occurred relatively
recently in their evolution.

Table 38-3 Frameshifts in Mycobacteriophage Tail Protein Genes

| Phage | "G" gene | "G" homologs | "T" gene | "T" homologs | Shift |
|---|---|---|---|---|---|
| L5 | 24 | D29 gp24; Bxz2 gp24; Bxb1 gp20 | 25 | D29 gp25; Bxz2 gp25; Bxb1 gp21; TM4 gp16 | -1 |
| D29 | 24 | L5 gp24; Bxb1 gp20; Bxz2 gp24 | 25 | L5 gp25; Bxz2 gp25; Bxb1 gp21 | -1 |
| Bxb1 | 20 | Bxz2 gp24; D29 gp24; L5 gp24 | 21 | D29 gp25; L5 gp25; Bxz2 gp25 | -1 |
| Bxz2 | 24 | Bxb1 gp20; D29 gp24; L5 gp24 | 25 | Bxb1 gp21; D29 gp25; L5 gp25 | -1 |
| TM4 | 15 | None | 16 | L5 gp25 | -1 |
| Bxz1 | 127 | None | 128 | None | -1 |
| Cjw1 | 20 | Che8 gp12 | 21 | Che8 gp13 | -1 |
| Che8 | 12 | Corndog gp54; Omega gp32; Cjw1 gp20 | 13 | Corndog gp55; Omega gp33 | -2 |
| Corndog | 54 | Che8 gp12; Omega gp32; Cjw1 gp20; Che9c gp13 | 55 | Che8 gp13; Omega gp33; Cjw1 gp21 | -2 |
| Che9c | 13 | Omega gp32; Cjw1 gp20; Corndog gp54 | 14 | None | -2 |
| Omega | 32 | Corndog gp54; Cjw1 gp20; Che8 gp12; Che9c gp13 | 33 | Corndog gp55; Che8 gp13 | -2 |

"G" and "T" refer to the G and T genes of phage λ.
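
The difference between a -1 and a -2 frameshift can be illustrated with a toy translation. The sketch below is a deliberately simplified model, not a description of any mycobacteriophage gene: the sequence is invented, the function names are hypothetical, and the ribosome is assumed simply to restart reading one or two nucleotides upstream of the next in-frame codon after a fixed number of codons. The codon table is the standard genetic code.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G (first base slowest).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(seq: str, start: int = 0) -> str:
    """Translate from `start` until a stop codon or the end of the sequence."""
    protein = []
    for i in range(start, len(seq) - 2, 3):
        amino_acid = CODON_TABLE[seq[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


def translate_with_frameshift(seq: str, codons_before_slip: int, shift: int) -> str:
    """Read `codons_before_slip` codons in frame 0, then move the reading frame
    by `shift` nucleotides (-1 or -2) and continue to the next stop codon."""
    slip_nt = 3 * codons_before_slip
    upstream = translate(seq[:slip_nt])
    downstream = translate(seq, start=slip_nt + shift)
    return upstream + downstream


toy_gene = "ATGGGGAAATAGCCTGGTAAC"  # invented sequence; frame 0 stops after 3 codons
print(translate(toy_gene))                         # frame 0 product
print(translate_with_frameshift(toy_gene, 3, -1))  # -1 frameshift product
print(translate_with_frameshift(toy_gene, 3, -2))  # -2 frameshift product
```

In this toy case the unshifted reading ends at the first stop codon, while each shifted reading bypasses it and yields a longer product with a different C-terminal extension, which is the general outcome the G and G-T pair represents.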


Tail Tape Measure Proteins 

The genes encoding the mycobacteriophage tape measure
proteins are easy to identify since they are typically the
largest genes in the genome and the length of the tape
measure gene correlates with phage tail length (42). This
relationship is also observed in the 13 mycobacteriophages with
noncontractile tails, although the tail lengths and tape
measure genes vary considerably: tails range from ~140 nm
(e.g., L5, D29, Bxb1) to over 260 nm (e.g., Corndog, Barnyard)
and the corresponding genes from approximately 2.5 to 6 kbp.
In a few cases (e.g., Barnyard, Omega, Che8) the gene is
somewhat larger than would be predicted from the tail length
and it is plausible that these proteins are proteolytically
processed prior to their role in tail length determination
(57).

The mycobacteriophage tape measure proteins have a
generally similar composition that is high in alanine and
glycine (combined amounts of 23% in L5 to 29.7% in Che8),
and none have more than a single cysteine residue; the
predicted pI is typically close to 8.0, ranging from 7.7 in
Che9d to 8.3 in D29 and Rosebush. These proteins, however,
are quite varied in their amino acid sequences and are
related to each other in complex ways. The most closely
related set of proteins are L5 gp26, D29 gp26, Bxb1 gp22,
and Bxz2 gp26, which can be aligned over their entire
lengths (pairwise identities range between 26% and 73%).
However, a central portion of these proteins (~450 amino
acids) also has similarity to a group of proteins encoded
by streptococcal phages Sfi11 and O1205, and lactococcal
phage r1t. In a more extreme case, Corndog gp57 has patches
of similarity to a number of other mycobacteriophage tape
measure proteins as follows: Omega gp34 over residues
11-472 (26%), 795-951 (28%), and 1337-1387 (40%); Che9c
gp15 over residues 89-171 (33%), 763-954 (32%), and
1337-1411 (34%); Che8 gp14 over residues 719-954 (27%)
and 1193-1390 (34%); Rosebush gp29 over residues
232-451 (26%); Barnyard gp33 over residues 223-505
(19%); Bxz2 gp26 over residues 237-428 (22%); Cjw1 gp22
over residues 733-904 (20%); and L5 gp26 and D29 gp26
over residues 295-428 (21%).
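
The compositional figures quoted for the tape measure proteins (combined alanine plus glycine content, and cysteine count) are simple to compute from a protein sequence. The sketch below shows the calculation under stated assumptions: the function names are hypothetical and the sequence is a short invented string, not a real tape measure protein.

```python
def ala_gly_percent(protein: str) -> float:
    """Combined alanine + glycine content as a percentage of all residues."""
    protein = protein.upper()
    if not protein:
        raise ValueError("empty sequence")
    return 100.0 * (protein.count("A") + protein.count("G")) / len(protein)


def cysteine_count(protein: str) -> int:
    """Number of cysteine residues in the sequence."""
    return protein.upper().count("C")


toy_protein = "MAGAGKCLAG"  # invented 10-residue example
print(ala_gly_percent(toy_protein), cysteine_count(toy_protein))
```

Applied to full-length sequences, the same two functions would give values comparable to the 23% to 29.7% range and the single-cysteine observation reported above.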

A particularly intriguing observation is the finding that
the tape measure protein of Barnyard (gp33) includes a
sequence motif related to bacterial cytokines (57). Previously
it has been shown that Micrococcus luteus secretes a
cytokine or resuscitation factor (Rpf) that promotes growth of
dormant M. luteus cells (55); Mycobacterium tuberculosis
contains five homologs of Rpf (Rv1009, Rv0867c,
Rv2389c, Rv2450c, and Rv1884c) and there are three
homologs in M. leprae, four in Streptomyces coelicolor, and
two in Corynebacterium glutamicum. While these various
proteins differ in size, they all share a well-conserved central
segment of approximately 90 amino acids to which residues
1490-1584 of Barnyard gp33 are similar; the most closely
related protein is Rv1009, which shares 61% identity over
a 90-residue segment of Barnyard gp33. This sequence
relationship strongly suggests a function of the tape measure
protein that was not previously recognized: the ability
to stimulate dormant host bacterial cells into a growing
state to support successful phage infection.

Upon reflection, this activity should perhaps not have
been unexpected. In their natural environments it
is likely that most bacterial cells are dormant or in a
nongrowing state. This poses a challenge to phages
since these hosts are not expected to support a productive
phage infection. The tape measure protein seems an
ideal location to place this viral alarm clock since it is one
of few phage proteins that must presumably pass through
the membrane and into the cell prior to DNA injection
through the tail. While little is known about how
Rpf-mediated signaling works, the tape measure protein is well
placed to mimic the resuscitation factor whether the
signaling receptor is membrane-bound or cytoplasmic. Moreover,
since the signaling motif is part of the phage particle, gene
expression is not required to promote phage-mediated
reawakening.

The ability to awaken dormant bacterial hosts should
confer a significant selective advantage to the virus and
it might be expected to be a common viral strategy. While
Barnyard is the only phage identified thus far that carries
the Rpf motif, there are additional motifs within the
tape measure proteins of other mycobacteriophages with
similarity to other small, putatively secreted mycobacterial
proteins. One of these motifs is found in the tape measure
proteins of Che9c (gp15), Cjw1 (gp22), Rosebush (gp29), and
Barnyard (gp33), and is related to M. tuberculosis Rv1115. A
third motif is present in the phage TM4 (gp17), Che8 (gp14),
and Omega (gp34) tape measures and is related to the putative
secreted M. tuberculosis proteins Rv0320 and Rv1728c. The
functions of these M. tuberculosis proteins are not known,
but we predict that they are also involved in cell-cell
communication.

Imposters in Structural Gene Clusters 

Lambdoid phages contain morons, open reading frames
flanked by a promoter and a stem-loop terminator that are
present in one genome but absent from related genomes;
morons can be inserted in either orientation relative to
the direction of transcription of the structural genes and
may be expressed in lysogens (36, 40) (morons are further
discussed in chapter 27). Identifying morons in
mycobacteriophage genomes is tricky since mycobacterial
promoters are less well defined and stem-loop terminators are rare.
However, the structural gene cluster is a good place to search
for both morons and gene insertions because of the
well-defined gene order.

The genomes of L5, D29, Bxb1, Bxz2, TM4, Cjw1, Che8,
Che9d, and Che9c have a canonical organization of the
structural genes, with the only major departure being the
location of the L5 gp6 tail gene as noted above. In both
Corndog and Omega there are notable departures; in
Omega there are approximately 15 open reading frames
between the head and tail genes (two of which are
transcribed on the opposite strand) where it is usual to find
four to six genes involved in head-tail connection. Ten of
the 15 have no database matches and three match Corndog
genes present in a similar location (probably involved in
head-tail connection). Interestingly, the two genes (16 and
17) immediately downstream of the capsid gene have
similarities to glycosyl transferases and O-methyltransferases,
respectively. This raises the intriguing possibility that the
products of these genes are involved in post-translational
modification of the phage particles.

In Corndog, a different departure in the structural gene
organization is seen. Typically the portal and capsid
genes flank the protease and scaffold genes, but in Corndog
there are six genes in this region (genes 35-40), one of which
(gene 39) encodes a putative protease. Two of the others
(genes 37 and 38) are homologs of Che8 gp109 and gp110,
located at the right end of the Che8 genome, and Corndog
gene 35 encodes an O-methyltransferase (a homolog of
Che8 gp108), perhaps suggesting that Corndog (as well as
Che8) particles may also be post-translationally modified.

Lysis Genes 

Release of phage particles at the end of a lytic cycle is
dependent on phage-encoded lysis genes (80). Identification
and characterization of the mycobacteriophage lysis genes
is of interest not only for understanding the mechanism
and timing of lysis but also because of the possible
therapeutic use of mycobacteriophage lysins; similar applications
have shown promise for control of streptococcal and
anthrax infections (20, 49, 74).

Putative lysis genes have been identified in
mycobacteriophage Ms6, where three genes have been implicated: lysA,
ORF3, and hol (25). Two of these, lysA and ORF3, have
proposed enzymatic functions while hol encodes a putative
holin (figure 38-4) (see chapter 10 for a discussion of
the functional relationship between lys and hol genes).
When the Ms6 LysA enzyme is expressed in Escherichia coli
the cells become sensitive to the addition of chloroform,
consistent with the action of a lytic enzyme. There is no
direct evidence for a lytic role of Ms6 ORF3 but it shares
sequence similarity with other mycobacteriophage proteins
that are implicated in this function (see below). The
assignment of a putative holin function to the Ms6 hol gene is
supported by sequence similarity with the putative holin of
lactococcal phage r1t and the ability of Ms6 hol to
complement a λ S mutant (25).

Lysis genes can be identified in most of the 14 sequenced
mycobacteriophage genomes (figure 38-4). The lysis genes
of phage Che8 (32, 33, and 34) are organized similarly to
those in Ms6, corresponding to the lysA, ORF3, and
hol functions, respectively; Che8 gp32 and gp33 share 57%
and 88% amino acid sequence identity with Ms6 LysA and
ORF3, respectively, and Che8 gp34 and Ms6 Hol are 98%
identical over the first 66 of the 77 residues. However,
phage Che8 is the only one that shares this organization
with phage Ms6 (figure 38-4).

All of the mycobacteriophages contain a lysA-like gene
(figure 38-4). However, the sequence relationships among
the gene products are complex and frequently only a
small segment of protein pairs is related. For example,
regions of the 424 residue Che8 gp32 (which is similar to
Ms6 LysA over its entire length) are related to other
mycobacteriophage proteins as follows: 1-146 matches TM4
gp29 (47%), 61-236 matches Bxz1 gp236 (27%), 393-425
matches Corndog gp69 (78%), and 337-421 matches
Che9d gp35 (40%). There are no significant matches to
other LysA-family members. A central segment of TM4
gp29 matches parts of Barnyard gp39, Bxz1 gp236,
and Corndog gp69 as well as the M. tuberculosis protein
Rv3594. This part of these proteins corresponds to a domain
associated with N-acetylmuramoyl-L-alanine amidases,
consistent with a function in cell lysis. Cjw1 gp32 has
a central segment related to M. tuberculosis Rv3766
(which is of unknown function but shares a small region of
similarity with Rv3594) and has an N-terminal segment
similar to Omega gp50, which in turn has a central segment
with a 1,4-β-N-acetylmuramidase domain that is shared
by many lysozymes. Preliminary data indicate that D29
gp10, Che8 gp32, and Bxz1 gp236 function as lytic enzymes
(T. Huang, L. Marinelli, and G. F. Hatfull, unpublished
observations). This LysA family of proteins thus appears
to be a particularly diverse and interesting group of lytic
enzymes that warrant considerable further investigation.

Figure 38-4 Mycobacteriophage lysis genes. With the
exception of phage Rosebush, all the sequenced
mycobacteriophage genomes along with the partially
sequenced Ms6 genome contain two genes encoding
putative lytic enzymes used for degradation of the cell wall
at the completion of lytic growth. The putative lysA genes
(light gray boxes) form a family of related sequences but the
relationships are complex and not all pairs of genes have
detectable sequence similarity. The lysB genes (dark gray
boxes) are also implicated in lysis. Putative holin genes
(diagonally striped boxes) can be identified in six of the
phages, with L5 11, D29 11, and Cjw1 33 forming one set
of related genes, Che8 34 and Ms6 4 another, and Bxz1
gp237 a third.

Phage Ms6 gp3, which lies between the lysA and hol
genes, may also be involved in host lysis (25). Interestingly,
homologs of this protein are found adjacent, or very
close, to lysA in all the other mycobacteriophages; the
one exception is phage Rosebush, which does not possess
a homolog (figure 38-4). In general, these proteins each
contain a segment of up to 250 amino acid residues at
the N-terminus that are related to each other; the C-termini
are more varied among the group. Some of these proteins
(e.g., TM4 gp30) have a peptidoglycan-binding motif at their
extreme N-terminus, consistent with a role in lysis.
Furthermore, preliminary data suggest that at least one of this
class of proteins (Che8 gp33) exhibits lytic properties
(T. Huang and G. F. Hatfull, unpublished observations). We
therefore propose that these are designated as lysB genes.

Putative holin genes can be identified in six of the
mycobacteriophages (L5, D29, Che8, Ms6, Cjw1, and Bxz1)
where they are adjacent or close to the other lysin genes.
Ms6 gp4 (hol) and its close relative, Che8 gp34, are putative
class II holins (25). L5 and D29 both have a gene (gene 11)
located between the two other putative lysin genes, 10
and 12, and preliminary data support the function of
gene 11 as a holin (Marinelli and G. F. Hatfull, unpublished
observations); Cjw1 gp33 is a homolog of L5 and D29 gp11.
Preliminary data also support the function of the phage
Bxz1 gene 237 as a holin (Huang and G. F. Hatfull,
unpublished observations). Holin genes have yet to be identified
in the other mycobacteriophages.


Genome Evolution 

The 14 completely sequenced mycobacteriophage genomes
are a highly varied group. What are their evolutionary
histories and what mechanisms gave rise to their present
structures? The availability of a group of genomes for
comparative analysis provides an opportunity to address
these questions, but with such a highly diverse group,
even nearly 1 Mbp of total sequence information and
over 1600 genes is still an inadequate dataset for any
detailed reconstruction of evolutionary histories.

In spite of these limitations, there are some striking 
features that reveal the dominant processes in bacterio¬ 
phage evolution. Perhaps the most obvious of these is the 
pervasive mosaicism, with each genome being composed 
of modules that are shared by one or more of the other 
phages. The “sharing" as we define it here for the most part 
means having significant sequence similarity of protein 
products since, with some notable exceptions, most of the 
genomes share little or no nucleotide sequence similarity. 
The evolutionary events (i.e., changes at the genomic level) 
that gave rise to the present structures likely occurred at 
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times far back in evolutionary time. What is most striking 
is that adjacent genomic modules (often corresponding to 
single genes) match modules in different mycobacteriophages. 
As a consequence, each module has its own 
evolutionary history, as illustrated in figure 38-5, and different 
modules have different phylogenetic relationships. There 
is, therefore, no single phylogenetic description for the phage 
as a unit of evolutionary change and any attempt to blend 
the many different histories into a single relationship discards 
the inferred histories of individual modules (44). 

Figure 38-5 Phylogenetic analysis of mycobacteriophage genes. The mycobacteriophage genomes are highly mosaic and it 
is not possible to draw aggregate phylogenetic relationships for the phages as whole genomic units that accurately reflect 
their evolutionary history. Panels A-D show the phylogenetic relationships of four genes that are present in at least seven of 
the genomes: A: lysB, B: capsid subunits, C: a putative tail protein, and D: a putative D-ala-D-ala carboxypeptidase. The genes 
of phages Omega, Cjw1, Che8, and Che9d (shown in boxes) clearly have different evolutionary histories. Trees were 
generated using the neighbor-joining method in ClustalX and displayed using NJPlot. The results of bootstrap analysis with 
1000 reiterations are shown. 
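
The point that each module carries its own history can be made concrete with a small sketch. The Python fragment below is purely illustrative: the phage and gene labels and all distance values are invented placeholders, not data from figure 38-5, and the helper names are our own. It records, for each gene, which other phage is the closest relative of a query phage and then asks whether all genes agree.

```python
from typing import Dict

# Hypothetical distances from a query phage to three other phages,
# measured separately for two genes (0.0 would mean identical proteins).
# These numbers are invented for illustration only.
DISTANCES_BY_GENE: Dict[str, Dict[str, float]] = {
    "gene_1": {"phage_A": 0.10, "phage_B": 0.45, "phage_C": 0.60},
    "gene_2": {"phage_A": 0.70, "phage_B": 0.15, "phage_C": 0.55},
}


def nearest_relative(distances: Dict[str, float]) -> str:
    """Return the phage with the smallest distance to the query."""
    return min(distances, key=distances.get)


def nearest_by_gene(table: Dict[str, Dict[str, float]]) -> Dict[str, str]:
    """Map each gene to the closest relative inferred from that gene alone."""
    return {gene: nearest_relative(d) for gene, d in table.items()}


def has_single_history(table: Dict[str, Dict[str, float]]) -> bool:
    """True only if every gene nominates the same closest relative."""
    return len(set(nearest_by_gene(table).values())) == 1


if __name__ == "__main__":
    print(nearest_by_gene(DISTANCES_BY_GENE))
    print(has_single_history(DISTANCES_BY_GENE))
```

With these invented numbers the two genes nominate different closest relatives, which is the situation that rules out a single whole-genome tree.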

How do phage genomes become so pervasively mosaic? 
The most obvious conclusion is that the modules are 
exchanging by horizontal transfer from one genome to 
another in a way that does not require large segments of 
DNA homology. This is not to deny that phage replication 
continually generates mutations that are inherited by vertical 
transmission from parent to progeny, or that these 
mutations are re-sorted by homologous recombination (as 
they surely are), but these processes are not responsible for 
generating genomic mosaicism (35). Moreover, mosaicism is 
so prevalent that the mechanisms that give rise to it are 
anything but a minor contribution to the evolutionary 
process. 

Typically, mosaic boundaries are found at or close to 
the ends of protein coding genes. In the lambdoid genomes 
the modules frequently consist of blocks of structural 
genes (37, 44) (chapter 27), although in the mycobacteriophages 
mosaicism is most evident outside the structural genes 
and modules typically are single genes. There are two plausible 
mechanisms by which mosaic boundaries could be generated. 
First, exchange could occur at short conserved 
“boundary” sequences as first proposed by Susskind and 
Botstein, resulting in a recombinant configuration of the 
differing flanking genes (78). Secondly, recombination could 
be truly illegitimate and occur randomly with respect to 
nucleotide sequence information. In this case, most events 
would generate genomes that are inappropriately sized for 
packaging and likely interrupt genes required for viral 
propagation (11). Thus, only a minority of the genomic 
trash will generate viable progeny, with surviving exchange 
events perhaps typically occurring at or close to gene 
boundaries. 

While short conserved “boundary” sequences have been 
reported in some phages of E. coli (12, 67), these are not 
generally present in the mycobacteriophages in spite of the 
pervasive mosaicism of these phages. Illegitimate recombination 
therefore seems a more likely mechanism, although 
acquiring supportive evidence is difficult since it is necessary 
to find events that have occurred recently, where the 
sites of recombination can be determined. Fortunately, 
comparative analysis of the mycobacteriophage genomes 
reveals several instances where two phages share almost 
identical DNA sequences, and there is one segment of 
378 bp in common between Corndog and Che8 that is 100% 
identical (57). The recombination events involved did not 
occur at gene boundaries, but rather within protein coding 
sequences—albeit not far from the ends of genes—without 
any evidence for sequence similarity. 
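
As an illustration of how such a shared segment might be located, the following Python sketch finds the longest stretch of identical sequence between two DNA strings using the standard-library `difflib` module. The two short "genomes" are invented toy strings, not Corndog or Che8 sequence, and this is only one simple way to perform the search; it is not the method used in the cited work.

```python
from difflib import SequenceMatcher
from typing import Tuple


def longest_identical_segment(seq_a: str, seq_b: str) -> Tuple[int, int, str]:
    """Return (start in seq_a, start in seq_b, segment) for the longest
    stretch that is 100% identical in both sequences."""
    # autojunk=False stops difflib from ignoring very frequent characters,
    # which matters for a four-letter alphabet.
    matcher = SequenceMatcher(None, seq_a, seq_b, autojunk=False)
    m = matcher.find_longest_match(0, len(seq_a), 0, len(seq_b))
    return m.a, m.b, seq_a[m.a:m.a + m.size]


# Toy sequences (hypothetical): a shared block flanked by unrelated DNA.
SHARED = "GGATCCTTAGC"
GENOME_1 = "ATGCCGTT" + SHARED + "TTTACG"
GENOME_2 = "CCCAAA" + SHARED + "ACGTGG"

if __name__ == "__main__":
    print(longest_identical_segment(GENOME_1, GENOME_2))
```

The positions returned mark the two ends of the identical block, which is where the junctions of a recent exchange would be sought.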

Who participates in these events? It is unlikely that 
these exchange events occur predominantly between two 
coinfecting phages since this is not a high-probability 
scenario in most natural environments. It seems more likely 
that recombination occurs between infecting phages and 
resident prophages. There are two important consequences 
of this supposition. First, in events where a resident prophage 
acquires segments of DNA, there is no immediate 
constraint on the size of the prophage. The prophage may then 
participate in subsequent events that are temporally separate from 
the first and provide a second chance to generate a genomic 
structure that can give rise to viable viral particles. 
Secondly, since these recombination events are occurring 
illegitimately, then recombination can occur between 
an infecting phage genome and any part of the resident 
bacterial chromosome. This accounts for the frequent 
presence of genes in the mycobacteriophage genomes that 
have been previously thought of as bacterial genes and 
supports the view that bacteriophages play a major role 
in horizontal genetic exchange between bacterial species. 

Gene Expression and Regulation 

The pattern of gene expression has been most closely 
investigated in the temperate mycobacteriophage L5 (9, 15, 
29, 34, 56). During lytic growth, two general patterns of 
expression are observed: synthesis of the right-arm genes 
early in the cycle followed by late expression of left-arm 
(i.e., structural) genes (see figure 38-2). A promoter responsible 
for early expression, Pleft, is located at the right 
end of the genome and promotes leftward transcription 
early in lytic growth but is then downregulated late in 
the cycle (56). The promoter responsible for late expression, 
and the mechanism of activation, have yet to be described. 

Phage L5 lysogeny is maintained through the action 
of the repressor, gp71 (15). L5 gp71 is expressed in prophages 
from three upstream promoters and represses the 
Pleft promoter through binding to a 13 bp operator (56). 
Curiously, the L5 genome contains over 30 similar DNA 
sites to which L5 gp71 binds, nearly all of which are located 
within short intergenic spaces or overlapping gene starts 
and stops (9). The 13 bp consensus sequence is well conserved 
and clearly asymmetric, and the sites are in the 
same relative orientation with respect to transcription. 
Reporter gene studies indicate that the binding of gp71 
to these DNA sites prevents the passage of transcriptional 
complexes initiating from an upstream heterologous 
promoter (9). These binding sites are referred to as 
“stoperator” sites since they appear to stop transcriptional 
elongation and may play a role in ensuring tight downregulation 
of phage genes in lysogeny that would otherwise prove 
deleterious to bacterial growth. In the lambdoid phages a 
similar function may be provided by the many stem-loop 
terminators present throughout their genomes (chapter 9). 
Mycobacteriophage genomes contain few such terminators, 
and when present they are usually at the ends of 
operons (34). Phage Bxb1 has a similar regulatory scheme 
to L5 and the heteroimmunity of the two phages can be 
accounted for by the relative binding specificities of the 
repressors for their cognate operator and stoperator sites 
(39). The genome of Bxz2 appears to contain a similar 
regulatory system. 
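
Because the stoperator consensus is asymmetric, a site and its reverse complement are distinguishable, so a genome scan can report the orientation of every match. The sketch below shows the idea in Python. The 13-mer used is an arbitrary placeholder chosen for illustration, not the actual L5 consensus, and the function names and mismatch threshold are our own assumptions.

```python
from typing import List, Tuple

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def hamming(a: str, b: str) -> int:
    """Number of mismatched positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def find_sites(genome: str, motif: str, max_mismatches: int = 1) -> List[Tuple[int, str]]:
    """Return (position, strand) for each window matching the motif
    ('+') or its reverse complement ('-') within max_mismatches."""
    k = len(motif)
    rc = reverse_complement(motif)
    hits = []
    for i in range(len(genome) - k + 1):
        window = genome[i:i + k]
        if hamming(window, motif) <= max_mismatches:
            hits.append((i, "+"))
        elif hamming(window, rc) <= max_mismatches:
            hits.append((i, "-"))
    return hits


# Placeholder 13 bp asymmetric motif (hypothetical, NOT the L5 consensus).
MOTIF = "AACCGGTTAGCTA"
TOY_GENOME = "TTTT" + MOTIF + "CCCC" + reverse_complement(MOTIF) + "GGGG"

if __name__ == "__main__":
    print(find_sites(TOY_GENOME, MOTIF, max_mismatches=0))
```

Tallying the strand labels against the direction of transcription of the surrounding genes would show whether sites share a common orientation, as described for L5.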

Two promoter elements have been identified in phage 
Ms6, located upstream of the lysis genes (25). Both are 
σ70-like promoters and are recognized by the host RNA 
polymerase. It is not clear how expression of the lysis 
genes is regulated, but the leader sequence between the 
transcription start sites and the first open reading frame 
can form two stem-loop RNA structures, one of which 
functions as a terminator. The regulatory schemes in other 
mycobacteriophages have yet to be explored, although 
WhiB-like transcription regulators in phages TM4, Che8, 
Che9d, Omega, and Cjw1 are worthy of investigation. 

Integration and Excision 

Of the 14 sequenced mycobacteriophages, nine (L5, D29, 
Bxb1, Bxz2, Che8, Che9c, Che9d, Cjw1, and Omega) encode 
putative integration systems. These systems have been 
shown to be active in L5 (46), D29 (22, 69), and Bxb1 
(A.I. Kim and G.F. Hatfull, unpublished observations). 
Active systems have also been demonstrated in Ms6 (24), 
FRAT1 (26), and the M. tuberculosis prophage-like element 
φRv1 (7). The integrases encoded by L5, D29, Che8, Che9c, 
Che9d, Cjw1, Omega, Ms6, and FRAT1 are all of the tyrosine 
family of recombinases, whereas those of Bxb1, 
Bxz2, and φRv1 are serine recombinases. Several of these 
systems have been adapted for the development of genetic 
tools (see below), and phages L5, FRAT1, and Ms6 have 
all been shown to integrate into tRNA genes in M. smegmatis 
or M. tuberculosis (24, 26, 46). 

Integration and Excision of L5 

The general scheme for integration and excision of L5 is 
similar to that of phage λ (2). The L5-encoded integrase 
(L5-Int) catalyzes site-specific recombination between the 
phage attP site and the chromosomal attB site to promote 
integration, but requires the host-encoded mycobacterial 
integration host factor, mIHF (figure 38-6) (45, 59). Integration 
generates two new attachment site junctions, attL and 
attR, and these act as substrates for excisive recombination, 
which is catalyzed by L5-Int and requires both mIHF and 
a second phage-encoded protein, L5-Xis, the product of 
gene 36 (48) (see chapter 7 for an overview of prophage 
integration and excision). 
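
The bookkeeping of the scheme just described (attP x attB giving attL and attR, and the reverse reaction on excision) can be sketched with plain strings. The Python model below is a deliberate simplification under stated assumptions: DNA is treated as text, the circular phage genome is written as a linear string that does not split the core, the 7-letter core and flanking sequences are invented toy values, and the protein requirements (Int, mIHF, Xis) are not modeled at all.

```python
from typing import Tuple


def integrate(host: str, phage_circle: str, core: str) -> str:
    """Insert a circular phage genome into a linear host at the shared core.

    The result carries two copies of the core: one in the left junction
    (attL) and one in the right junction (attR)."""
    b = host.index(core)          # position of the attB core in the host
    p = phage_circle.index(core)  # position of the attP core in the phage
    # Open the circle at the core: DNA after the core, then DNA before it.
    prophage_body = phage_circle[p + len(core):] + phage_circle[:p]
    return host[:b] + core + prophage_body + core + host[b + len(core):]


def excise(lysogen: str, core: str) -> Tuple[str, str]:
    """Recombine the two core copies (attL x attR) to restore attB in the
    host and release the phage circle, written starting at its core."""
    left = lysogen.index(core)
    right = lysogen.index(core, left + len(core))
    body = lysogen[left + len(core):right]
    host = lysogen[:left] + core + lysogen[right + len(core):]
    return host, core + body


# Toy sequences (hypothetical; the real L5 common core is 43 bp).
CORE = "GATTACA"
HOST = "CCCCC" + CORE + "GGGGG"
PHAGE = "TTTAA" + CORE + "AATTT"

if __name__ == "__main__":
    lysogen = integrate(HOST, PHAGE, CORE)
    print(lysogen)
    print(excise(lysogen, CORE))
```

Excision in this model returns the original host string and a rotation of the original phage string, reflecting the fact that a circular genome has no fixed starting point.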

The attP and attB sites for L5 integration share a 43 bp 
identical sequence (the common core) within which 
strand exchange occurs (46); L5-Int cleaves on either side 
of a 7 bp overlap region at the left end of the core (64). 
The functional attP site encompasses approximately 240 bp 
of DNA, which contains two types of binding sites: core-type 
sites on each side of the sites of strand exchange and 
arm-type sites that flank the core (figure 38-6) (63). There 
are seven arm-type sites, but only four (two pairs of closely 
spaced sites) are needed for either integration or 
excision (figure 38-6). 

Figure 38-6 DNA sites for L5 integration and excision. The L5 
attP and M. smegmatis attB site each contain a 43 bp common 
core which in attP is flanked by arm-type integrase-binding 
sites (P1, P2, P3, P4, and P5); two additional sites (P6 and P7, 
not shown) to the right of P5, as well as P3, are 
dispensable for both integration and excision. The mIHF host 
factor does not bind specifically to att site DNA but occupies 
the indicated areas in the presence of integrase. Integration 
requires both integrase (Int) and mIHF; excision requires Xis 
in addition to Int and mIHF (see chapter 7 for a general 
discussion of these mechanisms). Xis binds specifically to four 
closely spaced sites (X1-X4) adjacent to P2. 

L5-Int is a far-distant relative of λ integrase at the 
sequence level, but has a similar structural organization, 
with a small N-terminal domain that binds to arm-type 
sites and a large catalytic domain that binds to core-type 
binding sites (45, 63). L5-Int therefore has the potential 
to form intra- and intermolecular bridges with the two 
domains bound simultaneously to core- and arm-type 
sites. The role of mIHF, which does not by itself bind 
site-specifically to attP DNA, appears to be to promote 
intramolecular protein bridges by facilitating or stabilizing 
DNA bends (58). 

Integration can proceed through the DNA binding of 
phage L5-Int protein and host-factor mIHF to form an 
intasome complex that is then able to capture attB DNA 
through intermolecular Int bridges between the P1/P2 
arm-type Int binding sites in attP, and attB (60, 61). This 
process illustrates an interesting feature of L5-Int, in that in 
the absence of attB DNA, the P1/P2 pair of sites is unbound 
by L5-Int, even though L5-Int is present. When attB DNA is 
added, the intermolecular bridge is formed, suggesting, 
perhaps, that the binding of Int to the P1/P2 arm-type sites 
is stimulated when the core domain is also occupied (60). 
This interaction can occur in the absence of intasome 
formation, suggesting an alternative assembly pathway for 
recombination (60). 

When both DNAs and both proteins are present, a synaptic 
complex is formed that is a direct precursor of a presumed 
recombinagenic complex (60). Recombination by 
this synaptic complex is stimulated by mIHF and by DNA 
supercoiling, which can be provided by either the attP or 
attB substrate (62). When the reaction products are formed, 
attL is released as an Intasome-L complex, whereas attR is 
observed by native gel electrophoresis as unbound DNA 
(60); however, DNA footprinting shows that the L5-Int 
protein remains bound to the core sites of attR with the 
mIHF protein also present and bound adjacent and to the 
left of L5-Int (Lewis and Hatfull, unpublished observations). 

The recombination directionality factor, L5-Xis, not 
only stimulates excision but also inhibits integration 
(J.A. Lewis and G.F. Hatfull, unpublished observations). 
L5-Xis acts by binding to four sites (X1-X4) located between 
the core and the P1/P2 arm-type sites and introducing a 
DNA bend (figure 38-6). This bend facilitates the formation 
of an intasome complex with attR DNA that contains intramolecular 
bridges between the core and P1/P2 arm-type 
sites (J.A. Lewis and G.F. Hatfull, unpublished observations). 
The formation of this Intasome-R complex appears to be 
necessary for productive synapsis with an Intasome-L 
complex, although the reason is not yet clear. L5-Xis 
actively inhibits integration by preventing the conversion 
of synaptic complex 1 to the presumed recombinagenic 
complex within which strand exchange occurs (J.A. Lewis 
and G.F. Hatfull, unpublished observations). 

Integration and Excision of φRv1 

The prophage-like element, φRv1, is a resident of both 
sequenced strains of M. tuberculosis (13, 21) but is absent 
from the vaccine strain M. bovis BCG (50). It encompasses 
approximately 10 kbp and is bordered by two 12 bp direct 
repeats (of which there is only a single copy in BCG) and 
encodes a protein of the serine recombinase family of 
site-specific recombinases (75). Curiously, the element is 
situated at two different chromosomal locations in the 
two sequenced M. tuberculosis strains, suggesting that 
although it is unlikely to produce infectious particles it 
nevertheless is mobile. The recombination system has been 
shown to be active and utilizes an attB site that is part of 
the REP13E12 repeat, of which there are seven divergent 
copies in M. tuberculosis (7). The φRv1 element can integrate 
into four of these repeats and multiple integration 
events can be observed. A φRv1 gene encoding a recombination 
directionality factor has also been identified, 
although the mechanism by which it influences the directionality 
of recombination remains to be elucidated (7). 

Applications and Biotechnology 

Mycobacteriophages represent wonderful toolboxes for 
the development of mycobacterial genetics and novel 
methods for the control of mycobacterial diseases (30). 
While their full potential has yet to be realized, several 
useful advances have been made, which will be briefly 
discussed here. 

Vector Development 

The integration systems of L5, FRAT1, and Ms6 have all 
been used to construct integration-proficient plasmid 
vectors. In each case, a cassette containing the attP site and 
the integrase gene is inserted into a nonreplicating plasmid 
(which also contains an E. coli origin of replication) (24, 
27, 77). These integration vectors typically transform 
both fast- and slow-growing mycobacteria efficiently to 
generate recombinant strains with a single integrated 
copy of the plasmid inserted at the attB site. The advantages 
of these vectors are that they are maintained more 
stably than extrachromosomal plasmid vectors (46) and are 
present as a single copy, an important feature when phenotypic 
effects resulting from multicopy plasmids are observed 
(6). Since integrase can mediate excisive recombination 
at low frequency in the absence of a recombination directionality 
factor, further genetic stability can be achieved 
by providing the integrase gene on a nonreplicating plasmid 
that is subsequently lost after integration of an attP-containing 
integration vector (63). 

The use of phage repressor genes provides a further 
aspect to vector development. L5 gene 71 encodes the phage 
repressor that is required for lysogenic maintenance, confers 
immunity to superinfection (15), and can be used as a genetically 
selectable marker. The potential advantage of such 
a feature is that it avoids the use of drug-resistance genes, 
which are undesirable in the development of live recombinant 
vaccines (15). Furthermore, there is still a rather 
limited repertoire of positive selectable markers for mycobacterial 
genetics and the application of a variety of heteroimmune 
phage immunity systems could help to alleviate 
this problem. 

Transduction 

Two mycobacteriophages have been reported to be capable 
of generalized transduction: I3 (66) and Bxz1 (W.R. Jacobs 
and G.F. Hatfull, unpublished observations). Both these 
phages infect M. smegmatis and transfer genetic markers 
at varied frequencies. No phages have been described for 
generalized transduction of M. tuberculosis and this remains an 
important limitation on mycobacterial genetic systems. 

The use of recombinant mycobacteriophages for specialized 
transduction provides a useful method for gene replacement 
or mutagenesis in M. tuberculosis (4). Targeted 
gene replacement can be problematic in M. tuberculosis 
since simple electroporation with nonreplicating plasmids, 
followed by selection, yields a high proportion of illegitimate 
recombination events, even when chromosomal DNA 
is present on the plasmid molecule (41). Specialized transducing 
phages can be readily constructed by cloning 
polymerase chain reaction fragments of the sequences that 
normally flank the gene to be replaced into a cosmid 
vector such that these flanking sequences now flank a 
drug-resistance gene. This construct is then inserted into a 
conditionally replicating TM4-derived shuttle phasmid and 
recovered in E. coli via in vitro packaging into λ particles. 
DNA of the new recombinant phasmids is then electroporated 
into M. smegmatis to generate infectious phage particles, 
which can then be used to infect M. tuberculosis at 
nonpermissive temperatures. Typically, a high proportion of 
the bacteria selected for drug resistance contain a replacement 
of the gene of interest with the drug-resistance 
marker (4). If the replaced gene is essential then the infection 
can be performed in a strain into which a second copy of 
the gene has been introduced. 

Transposon Delivery 

Conditionally replicating phages are also useful as 
transposon-delivery systems (5). A series of shuttle phasmids 
has been constructed in conditionally replicating 
phages that contain transposons such as mini-Tn10 and 
Tn5367 that can be used to infect either fast- or slow-growing 
mycobacteria. Selection for the drug marker yields 
progeny in which the transposon has moved from the 
phage onto the bacterial chromosome. Similar phages have 
also been described for transposon delivery in M. paratuberculosis 
(28). The locations of the mutations that give rise to 
the conditionally replicating phenotype of these phages 
have not yet been mapped. 

Use of Mycobacteriophages for Diagnosis 
of Tuberculosis 

Two phage-based systems for the diagnosis of tuberculosis 
infections have been described. One is the PhaB assay, in 
which a mycobacteriophage is used to infect a clinical 
sample suspected of containing M. tuberculosis and the 
number of phage particles generated is evaluated by plating 
on M. smegmatis; the presence of M. tuberculosis in the 
sample can be inferred by an increase in the number of 
plaques (82). Furthermore, this method can be used to 
determine drug susceptibility profiles by examining the 
effects of antibiotics on phage production (19). Comparisons 
of this method with other traditional approaches for M. 
tuberculosis diagnosis and drug susceptibility testing have 
been conducted with encouraging results (16-19, 52, 81). A 
second method employs recombinant mycobacteriophages 
carrying the luciferase reporter gene (38, 
73). These FFlux reporter phages produce light when they 
infect M. tuberculosis and can be used to evaluate empirical 
drug susceptibility profiles by inclusion of antibiotics in 
the assay. This system offers good sensitivity and reliable 
drug susceptibility results and shows considerable promise 
in a clinical setting (3, 70-72). See chapter 46 for additional 
discussion of phage-based bacterial identification. 


Concluding Remarks 

Mycobacteriophages have proven themselves to be valuable 
toolboxes for mycobacterial genetics. However, as we 
begin to learn more about the genomic diversity of this 
group of bacteriophages it is quite evident that we know 
very little about the greater population of mycobacteriophages 
in the biosphere. The 14 completely sequenced mycobacteriophage 
genomes will require considerable dissection, 
both experimentally and bioinformatically, to understand 
the features that are already apparent. As the database of 
mycobacteriophage genomic information advances, we 
anticipate many new insights and expect a wealth of information 
that can be used to further our understanding of 
their pathogenic mycobacterial hosts. 
Bacteria in the genus Streptomyces are abundant in most 
soil environments (103). They are prolific producers of 
secondary metabolites, which often have antibiotic or other 
biological activity, and they have an unusual growth habit 
(57). A streptomycete colony grows initially as a vegetative 
mycelium, containing branching hyphal filaments with 
occasional cross-walls. As the colony matures, signals 
govern the production of aerial hyphae, which differentiate 
to form chains of spores. As spores are resistant to drought 
and can persist in the soil for many years, the high streptomycete 
count in soils is often attributable to the presence 
of spores (16). The isolation of phages from soil samples 
using a Streptomyces host is generally easy with most 
samples yielding phage after enrichment (57). Streptomyces 
coelicolor A3(2) is a model organism for this genus and has 
been studied most intensively. The linear genome contains 
8,667,507 bp (5). Although the genome does not have any 
obvious prophage sequences, it does contain about 11 
elements with features reminiscent of plasmids. 

Studies on phage biology often lead to the development 
of genetic tools for the manipulation of the host. Interest 
in Streptomyces species frequently relates to antibiotic biosynthesis 
and to the use of genetic techniques to generate 
novel secondary metabolites (50, 56, 77). Thus, the ability 
to transfer DNA between species is highly desirable, but 
streptomycetes vary substantially in their susceptibility 
to genetic manipulation. Phage-derived tools have made a 
major impact on genetic engineering of Streptomyces and 
their close relatives, Saccharopolyspora (57). As it is impossible 
to consider the life cycle of a phage without considering 
the host to which it is adapted, fundamental studies of 
Streptomyces phages have led to a deeper understanding of 
both phage and host. One temperate phage, φC31, which 
infects both S. coelicolor A3(2) and its close relative Streptomyces 
lividans, has been studied extensively and has been 
exploited in the development of vectors and genetic tools. 
The excellent genetic studies by Lomovskaya and colleagues 
provided the sound basis for future work with φC31 
and other Streptomyces phages (65). Two comprehensive 
accounts of the development and nature of cloning vectors 
from φC31 have been published (17, 57) and will not be 
discussed here. This chapter aims to summarize progress 
since the review by Lomovskaya et al. (65) on Streptomyces 
phage biology. 

Diversity and Evolution of Streptomyces Phages 

By far the most studied Streptomyces phage is φC31 and the 
major features of this phage are described below. However, 
many other actinophages have been isolated, not only with 
the intent to develop them into genetic tools but also to 
satisfy the curiosity of researchers (table 39-1). All of the 
isolated phages that infect Streptomyces, or the 
closely related genus Saccharopolyspora, are double-stranded 
DNA phages with similar morphologies, that is, icosahedral 
heads and long, flexible tails (1). Where studied, the genomes 
are linear with cos ends or terminal redundancy and vary in 
size between 40 kbp and 121 kbp. The ends of two phages, 
FP22 and φA1, are not of a known type, that is, they are not 
cohesive but do appear to be discrete (32, 41). Most of the 
phages that have been studied are temperate and the att 
sites of several that have been shown to integrate into the 
host chromosome have been mapped on the phage genome 
(see below). An important exception to these phages is the 
Streptomyces fradiae phage, φSF1, which exists as a plasmid 
in its prophage form (17, 22). A deletion analysis on the 
prophage has identified essential and inessential regions of 
the genome and those required for plaque formation or 
immunity. A strain containing the plasmid form can cause 
“pocks” (which are zones of growth inhibition), a property 
typical of transferable plasmids in Streptomyces. 

Studies on phage genomics have indicated that double-stranded 
DNA phage genomes are mosaics, having 
frequently exchanged or acquired genes by horizontal gene 
transfer, and have access to a common gene pool (9, 10, 45) 
(see chapters 4 and 27). A signature of horizontal exchange 
is a sudden change in the degree of sequence similarity 
when comparing one genome with another and this is 
Table 39-1 Properties of Streptomyces and Saccharopolyspora Phages 


Phage | Isolation host | Genome size (kbp) | % G+C | DNA ends | Phage-phage relationship | Phage-host interactions | Application | References 

φC31, 

φC43, 

φC62, 

φBT1, 

φSEA 

S. lividans 66 

Broad host range 

All around 
41.5 

63.6 

Cohesive 

All in the same immunity group. 
Similar at the DNA sequence 
level, but with additional DNA 
in some genomes and evidence 
of mosaicism. φC31 DNA cross-hybridizes 
to that of TG1. TG1 
and φA7 have similar genome 
organization to φC31 and φBT1 

Temperate. φC31 antimodification 
in S. albus. Sensitive 
to Pgl. Inhibits host rRNA 
synthesis in lytic growth. 
Receptor probably a 
glycoprotein 

Cloning and 
integration 
vectors; 
plasmid 
transduction 

17, 21, 23, 
57, 65, 86, 
87, 92 

TG1 

φA7 

φA2, φA4, φA9 

S. cattleya 

Broad host range 

S. antibioticus 

Broad host range 

S. antibioticus 

Broad host range 

41 

46.7 

43, 49, 53 


Cohesive 

Cohesive 

Cohesive 

Related to 4>C31 by DNA 

hybridization, cross-reactivity 
with antisera, and similar 
genome organization 

Similar genome organization to 
φC31, φBT1, and TG1 
Cross-hybridize with each other 

Temperate 

Temperate 

φA2 avoids SalPI restriction. 

φA4 avoids Sac restriction 

Phagemids; 
accommodate 
8.5 kbp 

33, 34 

31, 32 

32 

φHAU3 

φA5, φA6 

S. hygroscopicus 
Broad host range 

S. antibioticus Fairly 
broad host range 

51 

66 

Cohesive 

Terminally 

redundant 

Cross-hybridize strongly to each 
other and faintly to φA8 

Temperate. S. lividans resistance 
to φHAU3 is due to Ea59-like 
endonuclease 

Temperate 

Phagemids 

107, 108 

32 

R4 

S. albus J1074 
(R-IVT) 

Broad host range 

53.7 

67 

Cohesive 

R4 and SH10 have very similar 
restriction maps 

Temperate. Antimodification in 

S. albus 

Cloning and 
integration 
vectors. 

Plasmid 

transduction 

17, 20, 59, 
68, 71 

SH10 

S. hygroscopicus 
Broad host range 

~41 

68-73 

Cohesive 

Same immunity group as R4 

Temperate. Inducible by UV 


59, 60 

VWB 

S. venezualae 

Narrow 
host range 

43.7 

69.3 

Cohesive 

Genome organization similar to 

R4. Coat proteins distantly 
related to those of λ 

Temperate 

Cloning and 
integration 
vectors 

2, 3, 100 

φA8 

S. antibioticus 

Broad host range 

50 

— 

Cohesive 

Probably homoimmune with R4 

Temperate 

— 

32 

SAtl 

S. azureus 

~37 

71 

— 

— 

Temperate 

— 

72 

RP2, 

RP3 

S. rimosus 

Narrow 
host range 

64.7, 62.4 

70 

Cohesive 

~400 bp cross-hybridization 
between RP2 and RP3 

Temperate 

Plasmid 

transduction, 

integration 

36, 58, 74 


vectors 
SC623, 

SC681, 

SC347 

S. coelicolor (Muller) 

57 

68-71 


SC681 and 623 are homoimmune 

Temperate 


80 

FP43 

S. fradiae, 

68 


Terminally 

— 

Temperate 

Plasmid 

69 


S. griseofuscus 
Broad host range 



redundant 



transduction 


φA1 

S. antibioticus 

100 


Discrete 

— 

— 

— 

32 

FP22 

S. fradiae, 

131 

46 

Discrete 

Homoimmune with P23, and 

Temperate. Refractory to 


41 


S. griseofuscus 
Broad host range 




cross-hybridize 

restriction 



φSF1 

S. fradiae 

82 


Terminally 


Prophage is a plasmid and can 

Generalized 

22 





redundant 


form pocks. Two prophage 

transducing 








forms differ in their frequency 
of prophage induction 

phage 


φSV1, 3, 9, 

S. venezualae 

45 

— 

Terminally 

SV1, 9, 11 are different immunity 

Temperate 

Generalized 

94, 101 

10-12 

Narrow host 



redundant 

groups but cross-hybridize 


transducing 



range 






phages 


DAH1, 

S. coelicolor A3(2) 

93-121 

— 

— 

— 

— 

Generalized 

12 

DAH2, 

Also plaque on 






transducing 


DAH4- 

S. avermitilis and 






phages 


DAH6 

S. lividans 








JHJ3 

Saccharapolyspora 

41.1 

70 

Cohesive 

JHJ1, JHJ2 are derivatives of JHJ3 

JHJ3 temperate. JHJ1 is virulent 

Plasmid shuttle 

37, 38, 42 

(JHJ1, 

hirsuta. JHJ1 and 





JHJ1 and JHJ2 can form 

vectors 


JHJ2) 

JHJ2 broad host 
range in the 
Saccharopolyspora 





invasive plaques 




genus 








φC69 

Saccharopolyspora 

40 

— 

Cohesive 

— 

Virulent. Displays “lysis from 

— 

55 


erythreae Narrow 





without” in some Saccharo 




host range 





polyspora strains (e.g., 
NRRL2359) 



121, 

Saccharopolyspora 

41.9 (121) 

57.5-62.5 

Cohesive 

Homoimmune and cross- 

Virulent. SE-3 shows “lysis from 


78, 11, 93 

SE-3, 

erythreae Narrow 

42.2 (SE5) 



hybridize strongly, except 

without” with NRRL2359 



SE-5 

host range 




in central region. 121 cannot 
propagate on φFR113/114 
lysogens 




φFR113/ 

Saccharopolyspora 

43, 42 


Cohesive 

Homoimmune and cross- 



79 

φFR114 

rectivergula 




hybridize strongly, except 





(formerly Faeria 
rectivergula) 




in central region 
frequently seen in comparisons of lambdoid genomes (54). A 
study of phage genomes of S. coelicolor (Muller) and Saccharopolyspora 
(previously Faenia) using restriction endonuclease 
mapping also indicates that mosaicism is rife in the 
actinophages (78, 80). Heteroduplex analysis of DNA of 
φC31 and its homoimmune relative, φC43 (isolated from 
S. lividans 803), showed that they are highly homologous for 
most of their lengths but also that φC43 has an insertion, 
believed to be an IS element, and a short tract of nonhomology 
in a region toward the center of the genome (65, 86, 87). 
More recently, a comparison between the DNA sequences of 
the complete genomes of φC31 and a homoimmune phage, 
φBT1, also indicated sudden changes in sequence similarity 
between adjacent genes, from 81% to <17% similarity at 
the amino acid level (39). There are also segments of DNA 
that are present in one genome with no comparable DNA in 
the other and, as discussed below, these segments encode 
additional genes (39). 
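
The "sudden change" signature lends itself to a very simple scan. The Python sketch below is an illustration only: it takes a list of percent amino acid similarities for consecutive genes along a genome and flags each pair of neighbors whose values differ by at least a chosen amount. The similarity values are invented (only the 81% and 17% figures echo the text), and the 40-point threshold and function name are arbitrary assumptions.

```python
from typing import List


def similarity_breakpoints(similarities: List[float], min_change: float = 40.0) -> List[int]:
    """Return each index i where gene i and gene i+1 differ in percent
    similarity by at least min_change (a candidate mosaic boundary)."""
    return [
        i
        for i in range(len(similarities) - 1)
        if abs(similarities[i] - similarities[i + 1]) >= min_change
    ]


# Hypothetical percent similarities for five consecutive gene products
# compared between two related phage genomes.
TOY_PROFILE = [81.0, 79.0, 17.0, 15.0, 78.0]

if __name__ == "__main__":
    print(similarity_breakpoints(TOY_PROFILE))
```

Each flagged index marks a junction between neighboring genes where a horizontal exchange may have occurred; in practice the threshold would need to be chosen with the background divergence of the two genomes in mind.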

Because of frequent horizontal exchange of genes, the 
relationship between phages that infect the same genus is 
frequently obscure; indeed, the closest relatives to some 
φC31- or φBT1-encoded proteins in the database appear to 
be in phages that infect hosts evolutionarily very distant 
from Streptomyces (30, 45, 92). For instance, there is little or 
no significant sequence similarity between the genes encoding 
the coat proteins of the actinophages φC31 and VWB 
(2, 92). In fact, φC31 head-assembly proteins are similar 
to those of a host of non-streptomycete phages including 
coliphage HK97, whereas the coat proteins from VWB are 
related to phage λ coat proteins. These observations support 
the idea that among streptomycete phages there are at 
least two lineages of head protein gene clusters (9). 

Another feature of phage evolution is the maintenance
of genome organization within families of phage. In the
lambdoid family this facilitates homologous recombination
and leads to the rapid dissemination of recently introduced
genetic material within the gene pool (14, 15). Other
families of phages, frequently exhibiting significantly
different lifestyles, such as the Mu family (70) (see
chapter 30) or N15 (75) (see chapter 28), maintain a genome
organization different from that of the lambdoid family (see
chapter 27). Examination of the organization of the
Streptomyces phages suggests that they might also be
classified into families. The organization of the φC31
genome is known in detail and is described below. However,
even before the φC31 sequence was available, an overview of
its genome organization was gleaned from the position of the
attP/int locus and the locations of the deletable regions in
the temperate phages (18, 19, 43, 64). In phages φC31, φA7,
and TG1 the attP/int locus lies close to one of the cos ends
(18, 31, 34). In all three phages the deletable regions are
either adjacent to the attP/int region (the inessential
early region) or in the center of the genome, the latter
being associated with a clear-plaque phenotype (the c
locus). Some limited sequencing of restriction fragments
from TG1 confirmed the similarity of genome organization
between TG1 and φC31 (S. Sharp and M. C. M. Smith,
unpublished data).

Phages R4, φA8, VWB, and RP3 appear to belong to at
least one other family on these, albeit crude, criteria. The
locations of attP/int in R4, RP3, and VWB are closer to the
centers of the genomes than in φC31 (36, 82, 100).
In VWB the int/attP locus is just to the left of center and
the coat protein genes are located on the right arm. A
derivative of SH10 (a phage that has a similar restriction
map to R4) has a deletion close to one end of the genome and
under certain plating conditions has a clear-plaque
phenotype (102); if this deletion indicates the position of
the repressor gene, its position in phage SH10 is very
different from that in φC31. Other phages cannot be
classified with these criteria; φHAU3 also has attP at one
end, but insertions to generate phagemids have occurred at
the other end of the genome, in a position that in φC31
would coincide with the essential terminase and coat genes
(107). No information is available for other phages, but on
the basis of genome size alone it is highly likely that the
FP43, FP22, φA1, and DAH phages belong to yet other families
(table 39-1).

Streptomyces Temperate Phage φC31

The most extensively characterized phage of Streptomyces
spp. is the temperate phage φC31. This phage was isolated
in the late 1960s, ostensibly from a culture supernatant of
S. coelicolor A3(2) plated on S. lividans 66 (63). Subsequent
studies indicated that φC31 must have come from external
contamination (18). φC31 has a polyhedral head, a
noncontractile tail, and a baseplate from which short tail
fibres protrude (95) (figure 39-1). Lomovskaya and colleagues
studied adsorption, stability, and one-step growth curves,
mapped the φC31 chromosomal attachment site (attB), and
undertook a genetic analysis of the phage genome. Moreover,
they discovered the phage growth limitation (Pgl) phenotype
in S. coelicolor. The following sections expand on these
discoveries.

Genome Organization of φC31

The complete genome sequence of φC31 has been determined
(44, 61, 92). The φC31 genome is 41,491 bp in length
and contains 63.6% G+C (figure 39-2). The arrangement
of genes indicates that transcription of all genes except
one is in the left to right orientation. As in other phage
genomes, the early and late genes are clustered, with the
late genes on the left arm and the early genes on the right
arm. This was first suggested from the genetic analysis of
temperature-sensitive (ts) mutants of φC31 and later
confirmed by S1 mapping (53, 65, 96). There is a further
subclustering of genes according to function, for example
head and tail assembly, DNA packaging, and DNA replication
(figure 39-2). Comparisons of phage genomes have indicated
that the mosaicism of sequence relatedness between phages is
not random; the transition point between sequence similarity
and divergence frequently lies at the boundaries of
so-called functional modules (15, 54, 99). If genes for a
particular function are clustered and genetic exchange
occurs, recombinants are less likely to lack an essential
component than if the genes are scattered.

Figure 39-1 Electron micrographs of phage φC31.
Micrograph of R. Hendrix, Pittsburgh Bacteriophage
Institute.

Overall the late genes of φC31 have a similar gene
organization to many other phage late regions (9). The
major capsid and tail coat proteins were identified by
N-terminal sequencing and located on the gene map of
φC31 (92). Other structural genes encoding coat proteins
or putative packaging genes were identified by homology
to other phage genes in the database. The gene products in
the φC31 head assembly module are similar to those in a
group of phages that includes coliphage HK97 (92). The
pathway for head assembly has been studied in detail in HK97
(25, 35, 104) and it is likely that it is similar in φC31.
The arrangement of the tail genes also resembles that of
many other phages and includes the putatively frameshifted
open reading frame (ORF), g42, and, adjacent to this, a
putative tail tape measure gene, g43 (92). The product of
g49 is also of some interest owing to the presence of
collagen-like repeats, which are quite common in putative
tail fiber proteins (89). The early regions are generally
more variable between different phages than the late regions
(9). In phage φC31, g11, which is found in the early region,
encodes a putative DNA polymerase. Although considered
unusual for most temperate phages, genes encoding DNA
polymerases are common in the mycobacteriophages (44, 73)
(which are reviewed in chapter 38). Adjacent to g11 in φC31
is g9a, encoding a putative primase/helicase similar to many
other phage-encoded primases including the α protein from
phage P4 (109) (phage P4 is reviewed in chapter 26). It is
tempting to speculate that the mode of replication might be
similar to that of P4, that is, bidirectional replication
beginning at either of two origins (6). Whilst still on the
theme of DNA synthesis, φC31 and some mycobacteriophages
appear to encode enzymes for the metabolism of nucleotides.
In φC31, g15 through g28 and int lie within the region
termed “inessential”, as deletions can be generated in this
region without affecting plaque formation (17, 92). This
region contains two such genes: g20, encoding a putative
dCMP deaminase with close similarity to gene 36.1 from
mycobacteriophage D29, and g16, encoding a putative
alternative thymidylate synthase similar to g48 from both
L5 and D29. φC31 also contains g52, which encodes a putative
nucleotide kinase most similar to the T4-encoded homolog
(phage T4 is reviewed in chapter 18). Although the idea has
never been tested, all three of these genes may be
advantageous to φC31 replication by modifying the nucleotide
pools during lytic growth to maximize the burst size (see
chapter 5 for a discussion of phage growth parameters such
as burst size).

The genome of phage φBT1 is extremely similar to that of
phage φC31, with most of the genes in the late arm encoding
proteins between 73% and 96% identical to the φC31 homolog
(39). The early regions contain more variation, with
amino acid sequence identities ranging from <30% to 92%.
Moreover, several genes are present in one phage
without a corresponding homolog being present in the
other. One of these is φC31 g16 (discussed above), which
is predicted to encode an alternative thymidylate synthase
and is not present or required in phage φBT1. On the other
hand φBT1, but not φC31, encodes a putative DNA
methyltransferase, located between g9a and g11, two genes
that, respectively, encode homologs of DNA primase and DNA
polymerase (figure 39-2; 39). This additional gene in φBT1,
g9.1, could be part of an antirestriction strategy for φBT1
or alternatively part of a superinfection exclusion system
for φBT1 lysogens. Downstream of g9.1 is g9.2, also not
present in φC31, a gene that contains no recognizable
functional motifs. However, these two genes, g9.1 and g9.2,
have homologs in mycobacteriophage Corndog, namely g7 and g6
(73) (see chapter 38), and the fact that they are found
together in both φBT1 and Corndog suggests that the products
of these two genes may function together.

Figure 39-2 Genomic organization and transcription map of phage φC31. The genome is represented by a line, broken at the
middle, with the left arm being shown in the upper section and the right arm in the lower section. The ends represent the
cohesive (cos) ends of the linear genome. The genes are represented by boxes, some of which are numbered, with the late
genes in gray, early genes in white, and immediate early genes (repressor, c; integrase, int; gene 32) in black. The putative
activator, gene 12, is represented by a stippled box. The arrows below the genome represent transcripts; black are early or
immediate early and gray are late. The dotted lines are putative transcripts that have not been directly observed by S1
mapping. The positions of early promoters (black arrowheads), late promoters (gray arrowheads), immediate early
promoters (bent arrows) and terminators (stem-loop icons) are shown. Diamonds represent the repressor binding or
conserved inverted repeat (CIR) sites.

Infection 

The infection process requires adsorption of the phage
particle to a cell wall receptor and then injection of DNA
into the host cell. Lomovskaya and colleagues studied the
relationship between adsorption and successful infection
of S. coelicolor (65). Phage particles reversibly adsorb to
spores. However, only germlings about 5 hours old yield
productive infections, and these represent only about 20% of
the particles adsorbed to germlings. Recent studies have
shed light on the nature of the φC31 receptor (27). Mutants
of S. coelicolor strain J1929 (a stable Pgl− mutant; see
below) resistant to φC31 were isolated after UV or
spontaneous mutation and shown to plate the phage at
extremely low efficiency. These strains could support a
phage burst if the DNA was introduced by transfection,
indicating that the defect was early in the infection cycle.
Several of the phage-resistant strains, however, could
support growth of a φC31 derivative, originally isolated in
Lomovskaya's laboratory, called φC31hc. This phage was
itself isolated as being able to grow on a phage-resistant
derivative of S. lividans, and it seems likely that phage
φC31hc has compensated for alteration of the host receptor.
Restoration of phage sensitivity was achieved in one of the
resistant S. coelicolor strains, DT1017, by the introduction
of a gene, SCO3154, encoding a homolog of the eukaryotic
dolichol phosphate-D-mannose:protein O-D-mannosyltransferase
(27). Phage sensitivity could be restored to other
phage-resistant mutants by the introduction of SCO1423, a
gene encoding a homolog of dolichol phosphate-β-D-mannose
synthase (26). Together these findings strongly indicate
that the receptor for wild-type φC31 is a glycoprotein.
Possibly φC31hc can use the unmodified or incompletely
modified cell wall protein, which is yet to be identified,
as a receptor. In support of this model, a phage, φDT4002,
was isolated by purification of a derivative of φC31cΔ25
that can grow with a high efficiency of plating on
S. coelicolor DT1017 (i.e., the same phenotype as φC31hc),
and its late genes were sequenced; compared with its direct
parent, phage φC31cΔ25, g44 in φDT4002 contained a missense
mutation, implicating gp44 in interactions with the host
receptor (figure 39-2).

Induction into Lytic Growth 

S. coelicolor lysogens are not sensitive to UV (76). Indeed,
the physiological conditions that stimulate induction of
lysogens into lytic growth are not known, although high
levels of induction are observed during growth of young
mycelia (76). For instance, ts lysogens of S. coelicolor
could not produce progeny phage if induced before 3 hours or
after 10 hours post-induction (76). The inability of aging
mycelia to produce phage particles was studied further by
Suarez and colleagues and found to be due to a block in
transcription (96). Thus, φC31 appears to have a mechanism
to ensure maximal phage growth during rapid vegetative
growth of the host. Interestingly, studies of phage-host
interactions in soil confirm these observations (see below).

In contrast, a lytic phage, JHJ-1, that infects
Saccharopolyspora hirsuta appears to be able to grow on aging
mycelia (42). The plaques formed by JHJ-1 consequently are
not self-limiting and continue to grow for several days, a
phenotype reminiscent of coliphage T7 plaque growth (see
chapter 20).

Transcription in the Lytic Cycle 

Genetic loci required for lytic growth were characterized by
the isolation of ts mutants for phage replication (65). The
time at which growth was sensitive to temperature indicated
that all except one isolate were mutants in loci required for
late lytic growth. These loci were mapped by 2- and 3-factor
crosses; late genes lay to the left of the clear-plaque locus
and the single early mutation mapped to the right. S1 mapping
experiments confirmed this organization using RNA
prepared from induced ts lysogens carrying a φC31 cts1
prophage, which has a mutant allele of the repressor locus,
c (53, 96). Fine-mapping of the early region showed that
discrete transcripts were maximal 10 minutes post-induction
and were almost absent by 20 minutes (figure 39-2).
High-resolution mapping of the 5' endpoints showed that
transcription in the early region initiates at multiple
promoters, and a highly conserved sequence was identified in
all these promoters (51). Late genes are expressed from a
promoter, lp1, at the start of the late operon and
transcription probably continues through to a terminator
located upstream of g50 (figure 39-2; 48, 49). A second late
promoter, lp2, is located in the late operon.

Using transcriptional fusions the timing of transcription
of the phage promoters was examined. An early promoter,
epd, and a late promoter, lp1, were separately fused to xylE
and introduced into a φC31 cts lysogen (48, 51). Temperature
induction synchronously induced the phage lytic cycle, and
catechol 2,3-dioxygenase activity was maximal at 20 and
40 minutes, respectively, for the early and late promoters.
Neither promoter was active in the absence of induction.
The phage promoters are probably activated by a
phage-encoded protein and circumstantial evidence implicates
the product of g12 as the putative activator. g12 is
expressed from two promoters: an immediate-early promoter,
ap1, and a phage-specific early promoter, epf. Thus g12
is transcribed immediately upon entering the cell and
then transcription continues once the lytic cycle is under
way due to a proposed auto-activation via epf.

The promoter ap1 is repressed through an operator, CIR6,
which is a binding site for the phage repressor (see below),
and this provides a mechanism for controlling the lysis
versus lysogeny decision (105). Furthermore, a virulent
mutant of φC31, φC31vir1, was shown to contain a DNA
rearrangement resulting in the complete separation of the
-10 promoter sequence of ap1 from the CIR6 operator. The
resulting unregulated expression of g12 could explain the
virulent phenotype (105). Unfortunately, all attempts to
clone g12 in the absence of the phage repressor have failed
and we propose that this is due to toxicity of the g12 gene
product.

Studies directed at host transcription during the φC31
lytic cycle have shown a remarkable reduction in rRNA
synthesis and possibly transcription of all host genes
(23, 76). This type of phenomenon is more reminiscent of
lytic phages such as T4 (see chapter 18). Although the
mechanism for the reduction in host rRNA synthesis is not
clear, it does require protein synthesis after the start of
the lytic cycle, consistent with the synthesis of a
phage-encoded protein. These studies also noted a stimulation
of φC31 transcribing activity in crude preparations of RNA
polymerase prepared from induced cultures, perhaps
suggesting that the activator of phage promoters (putatively
gp12) is associated with RNA polymerase (23).

Lysogenic Growth 

Lomovskaya and colleagues isolated clear-plaque mutants,
including ts and dominant c mutants as well as the very
rare virulent mutants, which do not grow on bacteria
lysogenic for φC31 (65). All the clear-plaque mutants mapped
to a single locus, c, located in the center of the phage
genome (85). The repressor gene was isolated by its ability
to confer φC31 resistance (83). Transcription mapping and
analysis of the expression of the repressor locus indicated
that three in-frame proteins (the 74, 54, and 42 kDa
isoforms) are naturally produced from the c locus (84, 91).
Plasmids that could only express the 54 and 42 kDa isoforms
conferred full immunity to superinfecting phage. Purified 42
and 54 kDa isoforms were found to bind to a conserved
inverted repeat sequence (CIR) located at 16 places in the
phage genome (52, 105, 106). One of these CIRs appears to be
crucial for regulation of g12 and therefore the lytic cycle
(see above). Others appear to autoregulate the repressor
(52, 84). The roles of the remaining sites are not clear.
The role of the 74 kDa repressor is also not understood,
although it has been suggested to act as an antirepressor
(90).

For most temperate phages the establishment of lysogeny
requires two events: the repression of lytic genes and the
site-specific recombination of the phage genome into a
particular target (attB) in the host genome (see chapters 7
and 8). The φC31, φBT1, and R4 recombination systems are
unusual compared with most known phage recombination
systems as they employ integrases from the serine
recombinase family and are unrelated to the phage λ
integrase-like recombinases of RP3 and VWB (24, 36, 40, 61,
82, 100). Phage-encoded integration and excision is a highly
regulated process to ensure that the reaction is appropriate
to lytic or lysogenic growth. In phage λ, transcriptional and
post-transcriptional mechanisms operate to ensure that
the appropriate ratio of Xis to Int is expressed for
integration and excision. Both the RP3 and VWB recombination
systems also have recognizable xis genes located adjacent
to the int genes (36, 100). In vitro studies on the
properties of φC31 integrase show that the recombinase
catalyzes integrative recombination between attP and attB
but not excisive recombination (see chapter 7 for a general
discussion of phage integration and excision). Thus, phage
φC31 is also thought to encode an Xis protein to direct
excision by integrase, but this has not yet been identified.
An interesting aspect of the control of φC31 int expression
is that the int promoter straddles the crossover site (61).
After integration the promoter is therefore separated
from the int ORF. Moreover, the attB site lies within a
host ORF, SCO3798, and the direction of integration
precludes transcription of int from the promoter of this
host gene.

Restriction and Antirestriction in Phage Infection

It is generally thought that restriction of foreign DNA is
the host property that, in addition to adsorption sensitivity,
most limits the phage host range (28). Phage-sensitive
mutants have been isolated from resistant strains and this
has revealed the presence of type II restriction systems
(e.g., Streptomyces albus G encoding SalI and S. albus P
encoding SalPI) and the unusual phage defence system, Pgl,
in S. coelicolor A3(2). The high selection pressure to
overcome restriction has sometimes resulted in the removal
from phage genomes of sequences that act as targets for
restriction endonucleases (81). This has been shown to be
true for several Streptomyces phages (see data compiled in
17) and is very much in line with the empirical findings of
Cox and Baltz (28).

Now that we have the sequences of two complete Streptomyces
phage genomes, the analysis can be more precise.
Both φC31 and φBT1 sequences show a severe reduction in
the expected number of sites for several enzymes produced
by Streptomyces species whilst others, such as SalI, SapI,
SfuI, SsbI, and Ssp5230I, have considerably higher than
expected numbers (table 39-2). Possibly the high frequency
of sites for these enzymes reflects the presence of
antirestriction mechanisms, and it is of interest that one
phage, FP43, does appear to display antirestriction (69).
Furthermore, the antimodification system displayed by phages
R4 and φC31 to SalI may also be part of the avoidance of
restriction (see below). Of particular interest is the
number of SacAI sites in φC31 versus φBT1; the former appears
to have removed SacAI sites and the latter appears to have
increased their number, suggesting the presence of a
mechanism conferring immunity against SacAI restriction in
φBT1. The sequence of φBT1 reveals the presence of a putative
methyltransferase, absent in φC31, which may be part
of an antirestriction strategy (39). Another point that was
clear from the analysis by Chater (17) is that several of
the restriction sites that appear to have been avoided by
Streptomyces phages are for enzymes that are common in
many Streptomyces species, perhaps the best examples
being CTCGAG and CCGCGG (table 39-2).
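
The comparison of observed with expected site numbers can be sketched as follows. The calculation assumes that bases occur independently at frequencies set only by the overall G+C content, and it uses the φC31 genome length (41,491 bp) and G+C content (63.6%) quoted earlier. That the Expected column of table 39-2 was derived in this way is an assumption, although under it the values for sites such as CTCGAG, CCGCGG, and GATC come out close to those tabulated. The function names are illustrative only.

```python
# IUPAC codes appearing in the recognition sites, mapped to the bases they allow.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "W": "AT", "S": "CG", "N": "ACGT",
}


def base_probabilities(gc_fraction):
    """Per-base frequencies for a genome with the given G+C fraction."""
    gc = gc_fraction / 2.0
    at = (1.0 - gc_fraction) / 2.0
    return {"A": at, "T": at, "G": gc, "C": gc}


def expected_sites(site, genome_length, gc_fraction):
    """Expected occurrences of a recognition site, assuming independent bases."""
    probs = base_probabilities(gc_fraction)
    p_site = 1.0
    for symbol in site:
        p_site *= sum(probs[base] for base in IUPAC[symbol])
    return genome_length * p_site


PHI_C31_LENGTH = 41491  # bp, from the text
PHI_C31_GC = 0.636      # G+C fraction, from the text

for site in ("CTCGAG", "CCGCGG", "GATC", "GGWCC"):
    print(site, round(expected_sites(site, PHI_C31_LENGTH, PHI_C31_GC), 1))
```

This simple model ignores dinucleotide bias and local compositional variation, so a large departure from the expected value is only suggestive of selection for or against a site.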

Streptomyces phage FP22 has an extremely effective
mechanism for evading restriction. This phage appears to
contain none of the sites for characterized Streptomyces
type II restriction systems, even MboI recognition sites
(41). Possibly this reflects a general modification of
phage DNA.


The “Antimodifying” g Locus in R4 and φC31

Phages R4 and φC31 cannot form plaques on S. albus G but
can on an R−M+ mutant of S. albus (17, 20). Surprisingly,
however, the phages derived from the R−M+ mutant are
still restricted on S. albus G. Mutants can be isolated, called
g mutants, that will plaque efficiently on S. albus G. DNA
isolated from g+ R4 or φC31 grown on the R−M+ strain is
still cleaved by SalI, whereas DNA from the g− mutants is
modified and resistant to cleavage. The phenotype of the
wild-type phage therefore appears to be antimodifying, such
that the phage can grow in an R−M+ host but does not become
modified. Now that it is clear that many Streptomyces host
strains contain methylation-specific restriction (MSR)
systems, it appears to make sense that the phage resists
modification in case its next host carries an MSR system (66).
Indeed, it has been shown that R4 g− but not R4 g+ was
restricted by Streptomyces parvulus after propagation on S.
fradiae NRRL F1144 (17). It is now known that S. parvulus
contains an MSR system (66) so if S. fradiae is a modifying
host then only the g+ phages (antimodified) should avoid
restriction. The g locus has been mapped by deletion analysis
and by recombination and shown to be located toward the
left of the inessential region (64, 88).

Table 39-2 Frequencies of Recognition Sites in φC31 and φBT1 for Streptomyces and Saccharopolyspora Restriction Enzymes

Enzyme (a) | Source | Isoschizomers from other Streptomyces spp. | φC31 | φBT1 | Recognition site | Expected
SacAI | S. achromogenes ATCC 21353 | 6 | 2 | 206 | CCCGCC | 43
SacI | S. achromogenes ATCC 12767 | 1 | 0 | 0 | GAGCTC | 14
SacII | S. achromogenes ATCC 12767 | 15 | 2 | 0 | CCGCGG | 43
SacNI | S. achromogenes N-J-H | | 159 | 35 | GRGCYC | 106
SalAII | S. albus ATCC 21725 | 1 | 128 | 389 | GATC | 139
SalDI | S. albus ATCC 21132 | | 28 | 11 | TCGCGA | 14
SalGI | S. albus G ATCC 29789 | | 48 | 28 | GTCGAC | 14
SalPI | S. albus P CMI52766 | 5 | 0 | 0 | CTGCAG | 14
SanDI | Streptomyces sp. | 1 | 8 | 11 | GGGWCCC | 16
SapI | Saccharopolyspora sp. NEB 597 | | 38 | 42 | GCTCTTC | 2.6
SauHI | S. aureofaciens 13 | | 0 | 0 | CCTNAGG | 14
SbfI | Streptomyces sp. Bf-61 | 2 | 0 | 0 | CCTGCAGG | 1.4
ScaI | S. caespitosus | | 0 | 0 | AGTACT | 4.6
ScoI | S. coelicolor ATCC 10147 | | 0 | 0 | GAGCTC | 14
SexAI | S. exfoliatus | | 1 | 1 | ACCWGGT | 5.1
SfiI | S. fimbriatus ATCC 15051 | | 0 | 1 | GGCCNNNNNGGCC | 4.3
Sfr274I | S. fradiae 274 | 22 | 0 | 0 | CTCGAG | 14
SfuI | S. fulvissimus | 10 | 15 | 21 | TTCGAA | 4.6
SgfI | S. griseoruber | | 0 | 0 | GCGATCGC | 1.4
Sgh1835I | S. ghanaensis 1835 | | 94 | 84 | GGWCC | 154
Sgr20I | S. griseus K20 | 1 | 108 | 115 | CCWGG | 154
SgrAI | S. griseus | | 2 | 3 | CRCCGGYG | 10.7
SibI | S. albidoflavus Viikki 329 | | 2 | 7 | GGTCTC | 14
SnoI | S. novocastria | | 9 | 12 | GTGCAC | 14
Sol3335I | S. olivaceus IMRU3335 | | 0 | 0 | CAGCTG | 14
SphI | S. phaeochromogenes NRRL B-3559 | 1 | 8 | 10 | GCATGC | 14
SpvI | S. parvus NRRL B-1255 | | 0 | 0 | GGATCC | 14
SrfI | Streptomyces sp. | | 0 | 0 | GCCCGGGC | 4.3
SsbI | S. scabies | | 17 | 19 | AAGCTT | 4.6
Sse232I | Streptomyces sp. RH232 | | 0 | 3 | CGCCGGCG | 4.3
Sse8647I | Streptomyces sp. 8647 | | 1 | 2 | AGGWCCT | 5.1
SseAI | Streptomyces sp. | | 64 | 58 | GGCGCC | 43
Ssp2I | Streptomyces sp. RFL2 | | 259 | 239 | CCSGG | 270
Ssp5230I | Streptomyces sp. 5230 | | 53 | 52 | GACGTC | 14
SspBI | Streptomyces sp. | 1 | 7 | 4 | TGTACA | 4.6
SstIV | S. stanford | | 19 | 14 | TGATCA | 4.6
StuI | S. tubercidicus | 4 | 0 | 0 | AGGCCT | 14

(a) Information on restriction endonucleases from Streptomyces spp. is from http://rebase.neb.com/rebase/

φC31 grown in S. albus R−M+ is apparently restricted
approximately 100-fold on S. coelicolor A3(2), whereas
φC31 containing the Moscow deletion (ΔM) was not
restricted (17). Wild-type φC31 therefore appears to provide
a substrate for restriction that is not present in the ΔM
derivatives. ΔM removes all or part of genes 19-23 in
the φC31 inessential region (C. Finnis and M. C. M. Smith,
unpublished data). Other phages that appear to be able
to avoid restriction systems include φA4 and φA2.
The former avoids the Sac enzymes and the latter the SalPI
enzyme (32).

The Phage Growth Limitation (Pgl) Phenotype in S. coelicolor A3(2)

S. coelicolor A3(2) is naturally resistant to φC31 infection
(65). Whilst this was thought initially to be due to the
presence of a resident φC31 prophage that was conferring
immunity, this was refuted when Southern analysis was
performed on S. coelicolor genomic DNA with φC31 DNA as
a probe; no DNA homologous to φC31 DNA was detected
(18). The resistance phenotype was later found to be due to
the novel phage defence system conferred by Pgl (21). Briefly,
φC31 produced from a strain, such as S. lividans, that does
not naturally possess the Pgl system (or from a Pgl− mutant
of S. coelicolor) is able to infect wild-type Pgl+ strains
and produce a normal burst of progeny phage. However, when
these progeny phage are used to infect fresh Pgl+ bacteria,
the number of second-generation phages produced is greatly
attenuated. In the model proposed by Chinenova et al. (21),
progeny phage are modified after growth in a Pgl+ host and
this activates a mechanism to inhibit further phage
replication. The modification does not inactivate phages
released from Pgl+ hosts, as they can efficiently infect a
Pgl− host to give successive rounds of normal phage bursts.
One can speculate that such a strategy is particularly
advantageous to a Streptomyces colony; the sacrifice of one
cell to phage infection protects the rest of the clone but
additionally amplifies the phage so that competing species
in the same niche that do not encode Pgl are more likely to
be infected and thereby eliminated.
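
The logic of the model can be made concrete with a toy sketch. It encodes only the rules stated above: phage leaving a Pgl+ host are modified, modified phage give an attenuated burst in a Pgl+ host, and growth in a Pgl− host is always normal and yields unmodified progeny. The function names and the two-valued burst label are illustrative assumptions, and nothing here represents the molecular mechanism, which is unknown.

```python
def infect(phage_modified, host_is_pgl_plus):
    """Return (burst, progeny_modified) for a single infection in the toy model."""
    if host_is_pgl_plus and phage_modified:
        # Modified phage are limited in a Pgl+ host.
        return "attenuated", True
    # Otherwise the burst is normal; progeny are modified only in a Pgl+ host.
    return "normal", host_is_pgl_plus


def passage(hosts, phage_modified=False):
    """Pass a phage stock through successive hosts (True = Pgl+, False = Pgl-)."""
    bursts = []
    for host_is_pgl_plus in hosts:
        burst, phage_modified = infect(phage_modified, host_is_pgl_plus)
        bursts.append(burst)
    return bursts


# Unmodified phage (e.g., grown on S. lividans) passed twice through Pgl+ hosts.
print(passage([True, True]))
# The same stock passed through a Pgl+ host and then Pgl- hosts.
print(passage([True, False, False]))
```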

It is now known that four genes are required for the Pgl
phenotype, pglWXYZ (4, 62, 97). The pglWX and pglYZ loci
are only separated by about 6 kbp but the genes between
them apparently are not required. A plasmid containing
pglWX and pglYZ can confer the Pgl phenotype on S. lividans,
a close relative of S. coelicolor that does not naturally
contain any homologs of the pgl genes. PglX is predicted to
be a DNA adenine methyltransferase, which, in support of the
Chinenova model, suggests a role for DNA methylation.
There are no recognized motifs in PglZ but this protein may
be modified, possibly by PglW, which is predicted to be a
signaling protein containing Ser/Thr protein kinase
domains and an HTH DNA binding motif (47, 97). How the
proteins confer Pgl is unknown at the moment, as is the
target in φC31 on which Pgl is presumed to act. No φC31
derivatives have been isolated that are resistant to Pgl,
suggesting that the target is either in multiple copies in the
phage genome or is essential (62). As only phages in the same
immunity grouping as φC31 were sensitive to Pgl, it was
suggested that Pgl targeted the repressor-operator system,
possibly the highly conserved operators or CIR sites (97).
It has now been discovered that a heteroimmune phage, a
derivative of φHAU3, is sensitive to Pgl and this is currently
being investigated further (J. Leafe and M. C. M. Smith,
unpublished data).

Another interesting feature of the Pgl system is a high
frequency (approximately 10^-3 to 10^-[?] per spore) of phase
variation from Pgl+ to Pgl− and vice versa (62). Expansion
and contraction of a G-tract within the pglX gene have been
shown recently to account for Pgl phase variation in several
strains of S. coelicolor (98). It was also proposed that the
phase variation in the putative methyltransferase gene
might be part of a mechanism that switches the target of
Pgl and therefore the phage specificity. S. coelicolor encodes
a paralog of pglX, called pglS, the product of which might be
able to interact with the remaining Pgl proteins if pglX is not
expressed and direct the Pgl phenotype to an alternative
group of phage (98).
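
One way in which a change in G-tract length could switch a gene on and off is by shifting the reading frame, and that possibility can be sketched in a few lines. The text states only that expansion and contraction of the G-tract account for phase variation; the frameshift interpretation, the function name, and the reference tract length of 8 used below are assumptions made purely for illustration.

```python
def reading_frame_preserved(reference_tract_length, observed_tract_length):
    """True if the change in tract length is a multiple of three nucleotides,
    so that codons downstream of the tract stay in the original frame."""
    return (observed_tract_length - reference_tract_length) % 3 == 0


REFERENCE_LENGTH = 8  # hypothetical length of the in-frame ("on") tract

for length in range(5, 12):
    state = "in frame" if reading_frame_preserved(REFERENCE_LENGTH, length) else "frameshifted"
    print(length, state)
```

Under this assumption, gain or loss of a single G would switch the gene off, and a compensating change would restore it, which is consistent with reversible switching in both directions.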

How prevalent is Pgl? Recently homologs of pglWXYZ
have been deposited in the databases from the genome
sequencing of Thermobifida fusca, a mycelial thermophilic
actinomycete (http://www.ncbi.nlm.nih.gov/genomes/).
More distant relatives of the Pgl proteins can be detected
using PSI-BLAST in the cyanobacterium Nostoc punctiforme,
the archaeon Methanosarcina, and in Providencia rettgeri,
Vibrio cholerae, and Salmonella typhimurium LT2. As in
S. coelicolor and T. fusca, the genes encoding the Pgl-like
proteins in these diverse organisms are clustered, and in P.
rettgeri and V. cholerae they are located on mobile genetic
elements. However, it is unknown whether any of these
pgl-like clusters confers phage resistance.

Other instances of possible Pgl-like phenotypes have been 
previously observed. Saccharopolyspora erythraea NRRL 
2359 appears to be sensitive to “lysis from without” by 
certain phages, namely SE-3 and φC69, but these phages
cannot form plaques (55, 93). If a high MOI of φC31 is dropped
onto a lawn of S. coelicolor M145 (Pgl⁺), a zone of reduced
bacterial growth is observed but the phage is incapable of
forming plaques.

Host-Phage Interactions in Soil 

Most of our knowledge of phage-host interactions comes 
from studies in a very unnatural environment: agar plates 
on rich media (discussed generally for phages in chapter 5). 
How do these studies relate to the natural environment, which
for the streptomycetes is soil? Important observations on the
persistence of phage, the survival of lysogens, and the
infection cycle itself have been made by Prof. Wellington
and coworkers (13, 29, 46, 67). As in the agar-plate experiments,
rapid increases in phage counts were observed
under conditions that would favor germination and mycelial
outgrowth (29, 67). Thus phage multiplication was observed
in a soil environment on inoculation of fresh soil with spores
of lysogens, conditions that would support rapid germination
of the host and concomitant phage growth. The rise
in phage numbers was followed by a fall, corresponding to
the resporulation of the host. Smaller bursts were observed
after addition of fresh soil or mixing. An equilibrium was soon
established in which the numbers of lysogens and free phage
remained constant. Conversion of endemic bacteria to lysogens
on inoculation of unsterile soil with spores of a lysogen
occurred frequently (67), although this could be to the detriment
of the inoculated lysogen (46). An unusual observation
concerning phage-Streptomyces interactions in soil is
that the host numbers do not vary in response to the
numbers of phage (13). This is at odds with observations
from the marine environment (see chapters 32 and 33 as
well as chapter 5). The authors attribute the stability of the
host to: (i) susceptibility of newly germinated spores rather
than older mycelium, (ii) substrate mycelium in mature colonies
that adsorbs approximately 98% of total phages, thus
protecting young mycelia, (iii) a large burst size in soil, and
(iv) no measurable impact on growth of the host by the
phage, possibly due to the spatial heterogeneity in the soil
environment (13).

Outlook 

As in other fields in biology, genomics will fundamentally 
enhance our understanding of Streptomyces phages. Some 
of the greatest opportunities for research consequently lie 
ahead. One important question that is currently being 
addressed concerns the total diversity of phage genomes. 
Recently snapshots of the phage gene pool from different 
environments have been taken (7, 8). Virus-sized particles 
were purified directly from the environment and fragments 
of their DNA sequenced without imposing a bottleneck 
through amplification in specific bacterial hosts. The results 
indicate tremendous diversity, with up to 65% of the 
sequences being unique. Another study comparing the 
genomes of 14 randomly isolated mycobacteriophages (see 
chapter 38) upholds this view of unprecedented diversity 
(73). There is no reason to believe that the Streptomyces
phages will be any less diverse. Genomics combined with
ecological and laboratory-based approaches will continue 
to address the uniqueness of Streptomyces phage biology. 
What are the adaptations that Streptomyces phages have 
undergone that enable their growth and survival in their 
unusual hosts? How have phages adapted to an organism
that has a mycelial growth habit, and are there any phage-encoded
genes that enhance or modify antibiotic synthesis?
Conversely, how has the mycelial host evolved in response to 
phage infection? Finally, what role do phages have in the 
transfer of genetic information in the soil environment and, 
in the case of Streptomyces, in the evolution of antibiotic 
biosynthesis pathways? 
Mycoplasma Phages

JACK MANILOFF 
KEVIN DYBVIG 


Mycoplasma (with no italics) is the generic name for
the small, wall-less bacteria that arose by degenerate
evolution from the Streptococcus branch of the Gram-positive
bacterial phylogenetic tree (25). At present these
microorganisms are grouped into one putative genus plus
eight well-accepted genera (including Acholeplasma,
Spiroplasma, and Mycoplasma), together forming the class
Mollicutes (table 40-1). Mycoplasma genera have diverse
habitats and biochemical and physiological properties.
The genomes of some species are only 600-800 kbp.

The first mycoplasma phage was isolated in 1970 by 
R. N. Gourlay in England, using a bovine mycoplasma isolate 
as host cells and filtrates of bovine mycoplasma isolates 
as the phage source (16). Subsequently the host strain was 
identified as Acholeplasma laidlawii and the phage was identified
as a filamentous phage containing single-stranded
DNA. Over the next few years Gourlay isolated enveloped
quasi-spherical phages as well as short-tailed phages infecting
A. laidlawii. Since then, other workers have isolated
more Acholeplasma phages as well as phages infecting Spiroplasma
and Mycoplasma strains (23, 24). Phages active
against other Mollicutes genera have not been isolated. 

All mycoplasma phages that have been isolated are 
DNA phages (table 40-2). Mycoplasma phages with circular, 
single-stranded DNA genomes can be icosahedral or quasi- 
spherical in addition to filamentous. Most double-stranded 
DNA mycoplasma phages have short tails and linear DNA 
with particular terminal features, though an enveloped 
quasi-spherical double-stranded DNA phage has also been 
isolated. Mycoplasma phages with long tails have been
reported but have not been propagated. The absence of mycoplasma
cell walls means mycoplasma phage adsorption
must resemble that of animal viruses (i.e., adsorption to cell
membranes) rather than that of bacteriophages (i.e., adsorption
to cell walls or extracellular bacterial structures).
In addition, since Spiroplasma and Mycoplasma genera
evolved to use UGA as a tryptophan codon rather than a
stop codon, Spiroplasma and Mycoplasma phage genomes
must have coevolved with this codon change.
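
The consequence of the UGA reassignment can be illustrated with a minimal sketch. It assumes the standard genetic code with only the TGA codon (UGA in mRNA) changed to tryptophan; the function names and the 12-nucleotide toy ORF are hypothetical and are not taken from any phage genome.

```python
# Sketch: translate one DNA sequence under the standard code and under a
# code in which TGA (UGA in mRNA) encodes tryptophan instead of "stop".
BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
STANDARD_AAS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"


def build_table(tga_is_trp):
    """Return a codon -> amino acid dict; optionally reassign TGA to Trp (W)."""
    codons = [a + b + c for a in BASES for b in BASES for c in BASES]
    table = dict(zip(codons, STANDARD_AAS))
    if tga_is_trp:
        table["TGA"] = "W"
    return table


def translate(dna, tga_is_trp=False):
    """Translate from the first base and stop at the first stop codon."""
    table = build_table(tga_is_trp)
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = table[dna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


toy_orf = "ATGTGAAAATAA"  # hypothetical ORF: ATG TGA AAA TAA
print("standard code:   ", translate(toy_orf))
print("UGA read as Trp: ", translate(toy_orf, tga_is_trp=True))
```

Under the standard code the toy ORF terminates at the second codon, whereas with UGA read as tryptophan it is translated through to the final TAA. A phage gene containing in-frame UGA codons would therefore be truncated in a host that uses the standard code.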


Data on many mycoplasma phages have been 
reviewed and these references should be consulted for the 
original literature citations (23, 24, 32). In reviewing the 
early mycoplasma literature, it should be noted that there 
have been significant changes in mycoplasma taxonomy 
and nomenclature over the years. In keeping with current 
virus taxonomy (47), mycoplasma viruses are now referred 
to as mycoplasma phages although mycoplasma phage 
particles continue to be called virions. 


ssDNA Filamentous Phages 

Filamentous single-stranded (ss) DNA Acholeplasma phages 
are short, rod-shaped particles. There have been more 
than 50 reports of their isolation but only one, designated 
Acholeplasma phage L51 and infecting some A. laidlawii 
strains, has been characterized. Filamentous Spiroplasma 
phages, by contrast, are longer and more filamentous 
than the equivalent Acholeplasma phage. About 60% 
of Spiroplasma cultures produce filamentous phages. The 
first isolated was from Spiroplasma citri and designated 
Spiroplasma phage SpV1.

Acholeplasma Phage L51 

Virion and Macromolecules 

L51 virions are non-enveloped, bullet-shaped particles,
14 nm by 71 nm, with one end rounded and the other irregularly
shaped or flat. Virions have helical symmetry with
a subunit spacing of 4.8 nm. As regards virus taxonomy,
phage L51 is the type species of the genus Plectrovirus
(small filamentous phages) in the family Inoviridae (filamentous
phages containing ssDNA) (47) (see chapter 2 for
a discussion of phage classification). L1 virions are resistant
to treatment with the nonionic detergents Nonidet P-40
and Triton X-100; with the ionic detergent Sarkosyl NL97;
with DNase I; and with pronase. Virions are also relatively
heat and cold stable. On the other hand, virions are inactivated
by UV irradiation with one-hit kinetics.

Table 40-1 Properties of the Genera in the Class Mollicutes (a)

Taxonomy                      Oxygen           Sterol       Genome size  Habitat
                              requirement (b)  requirement  (kbp)

Family: Mycoplasmataceae
  Genus: Mycoplasma           FA               Yes          580-1350     Animals and humans
  Genus: Ureaplasma           FA               Yes          760-1170     Animals and humans
Family: Entomoplasmataceae
  Genus: Entomoplasma         FA               Yes          790-1140     Plants and insects
  Genus: Mesoplasma           FA               No           870-1100     Plants and insects
Family: Spiroplasmataceae
  Genus: Spiroplasma          FA               Yes          780-2200     Plants and insects
Family: Acholeplasmataceae
  Genus: Acholeplasma         FA               Yes          1500-1650    Animals, some plants, and insects
  Genus: Phytoplasma (c)      -                -            640-1185     Plants and insects
Family: Anaeroplasmataceae
  Genus: Anaeroplasma         OA               Yes          1500-1600    Bovine and ovine rumens
  Genus: Asteroleplasma       OA               No           1500         Bovine and ovine rumens

(a) Based on data reviewed in (31).
(b) FA, facultative aerobe; OA, obligate anaerobe.
(c) Phytoplasmas have not been cultured and, therefore, have no official taxonomic status. These microorganisms may be obligate intracellular parasites.
Phytoplasmas form a putative genus phylogenetically close to Acholeplasma (25). Although no phytoplasma growth data are available, genome sizes have been
determined from phytoplasma-infected plant and insect tissues.


Table 40-2 Properties of characterized mycoplasma phages (a)

Nucleic acid  Morphology                  Phage  Host           Genome size      Genome properties

ssDNA         Filamentous                 L51    A. laidlawii   4.3-4.5 kb (b)   Circular
                                          SpV1   S. citri       6.8-8.3 kb (c)   Circular
              Icosahedral                 SpV4   S. melliferum  4421 nt          Circular
              Enveloped, quasi-spherical  L172   A. laidlawii   14.0 kb          Circular
dsDNA         Short-tailed phage          L3     A. laidlawii   39.4 kbp         Linear, circular permuted, terminally redundant
                                          SpV3   S. citri       21.0 kbp         Linear, circular permuted, terminally redundant
                                          ai     S. citri       16.0 kbp         Linear, cohesive ends
                                          P1     M. pulmonis    11,660 bp        350 bp inverted terminal repeats, 5'-terminal proteins
              Enveloped, quasi-spherical  L2     A. laidlawii   11,965 bp        Circular
              Not determined              MAV1   M. arthritidis 15,644 bp        Linear

(a) References in the text.
(b) An L51-related strain, Acholeplasma phage L1, has a genome of 4491 nucleotides (M. Jaeger and G. Klotz, unpublished data, GenBank Accession No. X58839).
(c) As discussed in the text, three SpV1-related viruses have been sequenced, with genome sizes of 8273, 7768, and 6824 nucleotides.

The L51 circular, ssDNA genome is 4.3-4.5 kb. An early 
transfer of Gourlay’s original Acholeplasma phage isolate 
has been sequenced and shown to have a genome of 
4491 nucleotides (nt) (M. Jaeger and G. Klotz, unpublished 
data, GenBank Accession No. X58839). Purified L51 virions,
as analyzed by SDS-PAGE, contain four proteins of
70, 53, 30, and 19 kDa while the L51 genome sequence
encodes four putative proteins of 31, 23, 19, and 12 kDa.
The relationship between these putative L51 proteins and
those observed in L51 virions is not known.

Growth and Replication 

Phage L51 adsorption to A. laidlawii cells follows pseudo- 
first-order kinetics. The experimentally determined adsorption
rate constant (3-6 × 10⁻⁹ cm³/min at 37°C) is essentially
the same as the theoretical value calculated for
single-hit kinetics, so most phage-cell collisions result in
adsorption. However, competitive adsorption studies show
that only a small fraction of adsorbed L51 phages are
bound to functional sites: of about 300 phages that can
adsorb per colony forming unit (CFU), only 10-20 are
bound to functional sites. There are limited data on the 
nature of A. laidlawii phage receptors, although a variety of 
cell membrane macromolecules have been proposed as 
receptors. 
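
As a rough numerical illustration of these adsorption figures, the sketch below applies the usual pseudo-first-order model, in which cells are in large excess and the free phage titer decays as P(t) = P0 exp(-kCt). The rate constants are read as 3-6 × 10⁻⁹ cm³/min; the cell concentration and the 5 minute time point are hypothetical values chosen only for the example, and the function names are likewise illustrative.

```python
import math


def unadsorbed_fraction(k_cm3_per_min, cells_per_cm3, minutes):
    """Fraction of phage still free after `minutes`.

    Pseudo-first-order model: the cell concentration C is treated as
    constant, so P(t) / P0 = exp(-k * C * t).
    """
    return math.exp(-k_cm3_per_min * cells_per_cm3 * minutes)


def functional_site_fraction(functional_sites, total_sites):
    """Share of adsorbed phages that sit on functional sites."""
    return functional_sites / total_sites


CELLS_PER_CM3 = 1e8  # hypothetical culture density, not from the text
MINUTES = 5.0        # hypothetical sampling time

for k in (3e-9, 6e-9):  # range of rate constants quoted for L51
    free = unadsorbed_fraction(k, CELLS_PER_CM3, MINUTES)
    print(f"k = {k:.0e} cm^3/min: {free:.1%} of phage still free")

for n in (10, 20):  # functional sites out of about 300 bound phages per CFU
    print(f"{n} of 300: {functional_site_fraction(n, 300):.1%} on functional sites")
```

With the quoted numbers, only about 3-7% of the phages that can bind a cell occupy functional sites, even though nearly every collision leads to adsorption.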

Penetration of phage L51 ssDNA appears to be coupled
to its conversion to replicative-form (RF) double-stranded
(ds) DNA. Studies of L51-infected A. laidlawii cells show that the
L51 DNA replication cycle, like that of ssDNA bacteriophages,
proceeds in three steps: (i) synthesis of complementary-strand
viral DNA by host cell gene products,
converting parental ssDNA to RF dsDNA; (ii) RF replication
by a rolling-circle mechanism requiring host cell and phage
gene products to produce progeny RF dsDNA; and (iii) synthesis
of progeny virus ssDNA from RF dsDNA by asymmetric
DNA replication. An A. laidlawii REP⁻ mutant has
been isolated that can propagate dsDNA Acholeplasma
phages but not ssDNA Acholeplasma phages. In infections
using these cells, L51 parental ssDNA is converted to intracellular
RF dsDNA, but there is a block in RF replication
and no progeny RF dsDNA is formed. An A. laidlawii 14 kDa
protein is missing in REP⁻ cells and may be a cell gene
product required for RF replication. Phage L51 assembly
and release have been only minimally studied. An L51 DNA-protein
complex has been identified late in infection. The
complex contains L51 progeny ssDNA and the L51 70 and
53 kDa structural proteins, but in different stoichiometric
ratios than in L51 virions.

Growth studies originally indicated that L51 phage has 
a noncytocidal infectious cycle leading to persistently 
infected cells. In agreement with this model, there is no loss 
of viability of L51-infected A. laidlawii cells up to an MOI of 
10. However, infected cells grow more slowly and make smaller
colonies than uninfected cells, which probably explains
the turbid plaques formed by L51 phage. Data on the properties
of A. laidlawii cultures also indicate that A. laidlawii
strains persistently infected with L51-related phages are
prevalent in nature.

One-step growth curves of L51-infected A. laidlawii cells 
show a 10-15 minute latent period followed by increasing 
plaque-forming units (PFU) over the next 2-3 hours (the 
rise period), with a yield of 150-200 progeny phage per 
infectious center at 2-3 hours post-adsorption. The latent 
and eclipse periods are indistinguishable in artificial- 
lysis experiments, indicating progeny phage assembly and 
release are coupled. The rate of virus release decreases 
2-3 hours post-adsorption but the progeny phage titer 
continues to increase slowly, with no measurable loss in 
CFU. These growth studies were the first indication that 
L51 has a noncytocidal infectious cycle. See chapter 5 for a 
general discussion of phage life-cycle characters of infection 
and release. 


Spiroplasma Phage SpV1

Virion and Macromolecules 

As regards virus taxonomy, SpV1-related phages are classified
as species in the genus Plectrovirus in the family
Inoviridae (47). This classification is clearly wrong because
SpV1-related phages are significantly different from the
Plectrovirus type species, Acholeplasma phage L51 (above),
in both virion morphology and genome. It can be expected
that SpV1-related phages will be reclassified, probably
as a separate genus in the family Inoviridae. Although
several SpV1-related phage isolates (e.g., SpV1-R8A2B,
SpV1-C74, and SVTS2) have been studied, the available
data on the basic virology of SpV1-related phages are
still limited.

SpV1 virions are long, non-enveloped, filamentous particles,
10-15 nm by 230-280 nm, with one end rounded and
a flat plate at the other (23, 24). Micrographs also show
filamentous particles, 10-15 nm in diameter, that are two
or more times the length of virions. SpV1 virions have a
density of 1.39 g/cm³ in CsCl and 1.21 g/cm³ in metrizamide.
SpV1 biological activity is resistant to the nonionic detergents
Nonidet P-40 and Triton X-100; sensitive to chloroform
and ether; and heat stable. The genomes of three SpV1-related
phages have been sequenced and analyzed: SpV1-R8A2B,
SpV1-C74, and SVTS2.

Phage SpV1-R8A2B has a circular, ssDNA genome of
8273 nucleotides (nt) with a 22.9% G + C content, somewhat
less than the 26% G + C content of its S. citri host
cells (32). From the sequence, 14 open reading frames
(ORFs) were identified on the viral strand (the strand in the
virion, which is the + strand) and four putative ORFs
were identified on the complementary strand. One of the
putative viral-strand gene products has significant similarity
to transposases of the insertion sequence 30 (IS30)
family (29).

Phage SpV1-C74 has a circular, ssDNA genome of 7768 nt,
with 13 putative ORFs encoded in the viral strand and two
in the complementary strand (32). Eleven viral-strand
ORFs and two complementary-strand ORFs are conserved
between the SpV1-R8A2B (above) and SpV1-C74 genomes.
The putative SpV1-R8A2B IS30 transposase ORF is not
found in the SpV1-C74 genome. However, one of the SpV1-C74
ORFs, not found in the SpV1-R8A2B genome, has significant
sequence similarity to transposases of the insertion
sequence 3 (IS3) family (29).

Phage SVTS2 has a circular, ssDNA genome of 6824 nt,
with a G + C content of 22.7% (38). The viral strand encodes
14 putative ORFs. Nine of these have sequence similarity
to ORFs found in both phages SpV1-R8A2B and SpV1-C74.
The SVTS2 genome lacks an ORF with recognizable similarity
to a transposase. There is also sequence similarity
between an ORF encoding a gene product of unknown
function identified in all three SpV1-related phages and an
ORF in Acholeplasma phage L1.

Growth and Replication 

There are limited data on SpV1 phage growth (23, 24).
One-step growth curves of SpV1-infected S. citri host cells
show a 1 hour latent period followed by increasing PFU over
at least the next 6 hours. This suggests Spiroplasma phage
SpV1 has a noncytocidal infectious cycle similar to that
of the filamentous ssDNA Acholeplasma phage L51 and unlike
the cytocidal cycle of the ssDNA icosahedral Spiroplasma phage SpV4.

Multiple copies of full-length and partial genomes of
SpV1 and SpV1-related phages have been found in strains of
most Spiroplasma species examined (2, 32). The presence
of phage-encoded transposases and inverted repeats at the
termini of phage fragments integrated in cell chromosomes
suggests that RF dsDNAs of phages SpV1-R8A2B and SpV1-C74
function as insertion elements of the IS30 and IS3
families, respectively (2, 29). The extent of integrated SpV1-related
phage sequences implies these phages may have a
significant effect on Spiroplasma chromosome evolution.

Following phage SVTS2 infection of S. citri, resistant cells 
were selected and found to contain a 2.1 kbp fragment of 
the SVTS2 genome integrated into chromosomal and extrachromosomal
DNA. Transfection of this fragment into
phage SVTS2-sensitive S. citri transformed the host cell 
phenotype to SVTS2-resistance. This resistance phenotype 
appears to be due to inhibition of intracellular phage DNA 
replication rather than to a change in cell surface phage 
receptors (37). 

ssDNA Icosahedral Phage 

Spiroplasma Phage SpV4 

Virion and Macromolecules 

Spiroplasma phage SpV4 was isolated from and propagated 
in Spiroplasma melliferum strains (23, 32). Virions are 
icosahedral (T=1) particles that are 27 nm in diameter with
20 protrusions, each 5.4 nm long, making SpV4 the only 
known icosahedral mycoplasma phage (7). As regards 
virus taxonomy, Spiroplasma phage SpV4 is the type species 
of the genus Spiromicrovirus in the family Microviridae 
(small icosahedral phages containing ssDNA) (47). 

The family Microviridae contains four genera: Microvirus
(host: Enterobacteriaceae), Spiromicrovirus (host:
Spiroplasma), Bdellomicrovirus (host: Bdellovibrio), and
Chlamydiamicrovirus (host: Chlamydia). It has been proposed
recently that these four genera form two subfamilies
within the Microviridae, one subfamily consisting of the
genus Microvirus and the other subfamily consisting of
the other three genera (4). Three shared features of this
latter proposed subfamily are seen with the Spiroplasma
phage SpV4: (i) the capsid lacks spikes, (ii) external head-scaffolding
proteins are also lacking (4), and (iii) the major
SpV4 coat protein is more complex than that of Microvirus
phages (e.g., φX174). In particular, the SpV4 coat protein
contains an insertion loop that forms the 5.4 nm protrusions
that may function similarly to Microvirus spike proteins (7).

SpV4 virions have a density of 1.40 g/cm³ in CsCl and
1.24 g/cm³ in metrizamide. SpV4 biological activity is resistant
to the nonionic detergent Triton X-100, the ionic detergent
sodium dodecyl sulfate, chloroform, ether, DNase,
RNase, and proteinase K. The SpV4 virion nevertheless is 
more heat-sensitive than the ssDNA filamentous phages 
infecting the Acholeplasma. 

Spiroplasma phage SpV4 has a circular, ssDNA genome
of 4421 nt with a G + C content of 32%, higher than the 26%
G + C content of its host S. melliferum (32). A total of nine ORFs
are encoded, all on the viral strand (the strand in the virion,
which is also the + strand). ORF1 encodes the 64.0 kDa
major capsid protein (corresponding to the 60 kDa virion
protein identified by SDS-PAGE), which displays significant
sequence and structural similarity to the major capsid proteins
of Bdellomicrovirus and Chlamydiamicrovirus phages.
Two other SpV4 ORFs have significant similarities to Chlamydiamicrovirus
ORFs, one of which is a putative endonuclease.
The nine SpV4 ORFs would produce proteins of
64.0, 32.1, 17.3, 14.1, 9.5, 8.5, 5.6, 4.9, and 3.8 kDa. Only the 
64.0 kDa protein, the major capsid protein, has been 
identified in virions or in SpV4-infected cells. 

Growth and Replication 

One-step growth curves of SpV4-infected S. melliferum show 
a 1-2 hour latent period followed by a 4 hour period of rapid
progeny phage release. The progeny phage yield gradually
reaches a plateau of 100-200 phage progeny per infectious
center at 20 hours post-adsorption. These results, together
with the fact that SpV4 produces clear plaques, indicate that
SpV4 has a nonlytic, cytocidal productive infection cycle in
which infected cells are no longer viable but continue to
release progeny virions for many hours. This type of infectious
cycle is different from the lytic infections of phages in
the other Microviridae genera, but similar to that found 
for the short-tailed mycoplasma phages (described below). 

Sequence analysis of the SpV4 genome indicates that 
DNA replication, transcription, and translation are similar 
to those of other ssDNA phages. An SpV4 intergenic inverted 
repeat sequence, with seven GC base pairs, may form the 
hairpin structure used by RNA polymerase or primase for 
synthesis of the RNA primer for complementary strand 
synthesis as in other ssDNA phages. Each of the nine 
SpV4 ORFs has an upstream Shine-Dalgarno sequence, and 
eight ORFs start with an ATG codon and one with a GTG 
codon. Transcription appears to use promoter and rho-independent
termination sites similar to those in Gram-positive
bacteria.

Four SpV4 mRNAs (with sizes 2.7, 3.4, 4.4, and 7.8 kb)
have been identified in SpV4-infected host cells at all
times post-adsorption. The last of these mRNAs is longer than one
genome in length. It has been proposed that SpV4 transcription
starts at different promoters and stops at the single
rho-independent termination site identified by genome
sequence analysis. In this model, the 3.4 and 7.8 kb
mRNAs start at the same promoter, but synthesis of the
7.8 kb mRNA reads through the termination site, continues
and transcribes the entire genome, and stops at
the termination site when it reaches it again.
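
The read-through model can be checked with simple arithmetic. The sketch below treats the reported mRNA sizes as lengths in kilobases (an assumption of this example) and asks whether the shorter transcript plus one full lap of the 4421 nt circular genome gives the length of the longest transcript. The function name is illustrative only.

```python
GENOME_NT = 4421  # SpV4 genome length given in the text


def readthrough_kb(first_pass_kb, genome_nt=GENOME_NT):
    """Length of a transcript that passes the terminator once and stops
    when it reaches the same terminator again, i.e. the first-pass
    transcript plus one complete lap of the circular genome."""
    return first_pass_kb + genome_nt / 1000.0


predicted = readthrough_kb(3.4)
print(f"predicted read-through transcript: {predicted:.1f} kb")
print(f"reported longest transcript:       {7.8:.1f} kb")
```

On these assumptions the predicted length agrees with the reported size of the longest mRNA to one decimal place, which is what the single-terminator model requires.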

dsDNA Short-Tailed Phages 

One of Gourlay's original Acholeplasma phage isolates 
was a virus morphologically similar to coliphage T7 (see 
chapter 20): short-tailed phages containing linear dsDNA. 
This was a surprising morphology for a mycoplasma phage 
because mycoplasma cells have no cell wall and it was not 
expected that phage tails would have a role in adsorption 
in such situations. This original short-tailed mycoplasma 
phage was designated L3. Another, apparently similar 
short-tailed Acholeplasma phage has been reported but 
only L3 has been characterized (23, 24). 

Short-tailed phages have also been found in over 60% 
of Spiroplasma strains examined, representing several 
Spiroplasma species. Based on genome size and structure 
(table 40-2), there are probably three types of short-tailed
Spiroplasma phages: SpV3, ai, and SRO phages. Only
limited data are available on these phages (23, 24). 

Over 100 Mycoplasma species have been recognized (31). 
For the vast majority of these species, no evidence of 
phage exists. The relative lack of known Mycoplasma phages 
is hardly surprising given that most investigators do not 
design experiments with the goal of identifying novel 
phages. Also, the fastidiousness of mycoplasmas can
render the original isolation and subsequent characterization
of phages technically challenging. The only Mycoplasma
phage that has been characterized is P1, a short-tailed
dsDNA phage isolated from the murine pathogen Mycoplasma
pulmonis.

Acholeplasma Phage L3 

Virion and Macromolecules 

L3 virions have a polyhedral head with a diameter of
about 60 nm, a collar about 8 nm thick and 16 nm wide,
a short tail about 10 nm wide and 20 nm long, and fibers
attached to the collar. As regards virus taxonomy,
Acholeplasma phage L3 is an unassigned species in the
family Podoviridae (short-tailed phages) (47). L3 virions
have a density of 1.477 g/cm³ in CsCl, and are resistant to
treatment with the nonionic detergents Nonidet P-40 and
Triton X-100, and are also resistant to ether. Virions are
inactivated by UV irradiation with one-hit kinetics and
are relatively heat-sensitive.

Electron microscopic measurements found L3 virions 
contain linear dsDNA with a size of 39.4 kbp. Subsequent 
studies showed that this 39.4 kbp DNA consists of a 36.2 kbp 
L3 genome plus about 8% terminal redundancy. L3 DNA 
also has limited circular permutation, producing a circular 
36.2 kbp restriction endonuclease map and suggesting 
that assembly may involve packaging from pac sites. 
Purified L3 virions contain 19 proteins as determined by 
SDS-PAGE. The major protein is 43 kDa and may be the 
major capsid protein. Complementation tests using L3 
temperature-sensitive mutants have shown at least 21 
complementation groups. 
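
The terminal redundancy figure follows directly from the two sizes quoted above. The short sketch below makes the arithmetic explicit; whether the percentage is expressed relative to the packaged DNA or to the unique genome is an assumption of the example, so both are computed.

```python
def terminal_redundancy(virion_dna_kbp, genome_kbp):
    """Return (redundant kbp, fraction of packaged DNA, fraction of genome)."""
    redundant = virion_dna_kbp - genome_kbp
    return redundant, redundant / virion_dna_kbp, redundant / genome_kbp


redundant, of_virion_dna, of_genome = terminal_redundancy(39.4, 36.2)
print(f"redundant DNA per virion: {redundant:.1f} kbp")
print(f"as a share of the packaged DNA:  {of_virion_dna:.1%}")
print(f"as a share of the unique genome: {of_genome:.1%}")
```

Either way the result is close to the "about 8%" stated in the text: each virion carries roughly 3 kbp more DNA than one genome equivalent.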

An interesting aspect of the L3 genome is that it contains
no GATC sequences although it would be expected
to have about 150 GATC sites. The absence of GATC sites 
was first suggested by the observation that L3 phages are 
not restricted by an A. laidlawii strain that restricts phages 
with DNA containing GATC sequences, and confirmed 
by studies showing that L3 DNA is resistant to restriction 
endonucleases that cleave unmethylated and methylated 
GATC sites. Hence, like a number of other phages, L3 has 
evolved under selective pressure for the loss of sequences 
recognized by host cell restriction endonucleases. 
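
A back-of-envelope estimate shows why the absence of GATC sites is striking. The sketch below assumes a random sequence with independent bases; the default of equal base frequencies is an assumption of the example (the L3 base composition is not given here), and the `gc_fraction` parameter and function name are illustrative.

```python
import math


def expected_site_count(genome_bp, gc_fraction=0.5):
    """Expected number of GATC sites in a random sequence.

    Assumes independent bases with G = C = gc_fraction / 2 and
    A = T = (1 - gc_fraction) / 2. GATC is palindromic, so counting
    on one strand is sufficient.
    """
    g = c = gc_fraction / 2
    a = t = (1 - gc_fraction) / 2
    return genome_bp * (g * a * t * c)


expected = expected_site_count(36_200)  # 36.2 kbp L3 genome
chance_of_none = math.exp(-expected)    # Poisson probability of zero sites
print(f"expected GATC sites: {expected:.0f}")
print(f"probability of zero sites by chance: {chance_of_none:.1e}")
```

With equal base frequencies the estimate is of the same order as the roughly 150 sites mentioned in the text, and the chance of finding none in a random sequence of this length is vanishingly small, consistent with selection against the site.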


Growth and Replication 

L3 adsorption to A. laidlawii cells follows pseudo-first-order
kinetics with a rate constant of about 3 × 10⁻¹⁰ cm³/min
at 37°C. The rate constant can vary by a factor of 5 depending
on the cells and media. The theoretical value calculated
for single-hit kinetics is 2 × 10⁻⁹ cm³/min. Therefore,
depending on the experimental conditions, anywhere from
a few percent to most phage-cell collisions result in
adsorption. L3 adsorption requires Ca²⁺, which cannot be
replaced by Mg²⁺ or monovalent cations, and has been
proposed to involve electrostatic interactions between
virion and cell-membrane proteins.
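
The statement that a few percent to most collisions lead to adsorption can be reproduced by taking the ratio of the observed to the theoretical rate constant. The sketch below reads the constants as 3 × 10⁻¹⁰ and 2 × 10⁻⁹ cm³/min and assumes that "a factor of 5" means a five-fold spread in either direction around the typical value; that interpretation and the function name belong to the example, not to the source.

```python
def adsorption_efficiency(k_observed, k_theoretical):
    """Fraction of phage-cell collisions that result in adsorption."""
    return k_observed / k_theoretical


K_THEORETICAL = 2e-9  # cm^3/min, single-hit value quoted in the text
K_OBSERVED = 3e-10    # cm^3/min, typical measured value for L3
FOLD = 5.0            # assumed spread in either direction

cases = (
    ("5-fold lower", K_OBSERVED / FOLD),
    ("typical", K_OBSERVED),
    ("5-fold higher", K_OBSERVED * FOLD),
)
for label, k in cases:
    efficiency = adsorption_efficiency(k, K_THEORETICAL)
    print(f"{label:>13}: {efficiency:.0%} of collisions adsorb")
```

Under these assumptions the efficiency runs from a few percent up to about three quarters of all collisions, matching the range described in the text.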

Adsorbed L3 virions behave like polyvalent ligands:
they diffuse along the cell membrane and crosslink receptors.
It has been proposed that the fibers attached to the
L3 virion collar function as polyvalent determinants. As
adsorbed L3 virions diffuse on the cell membrane, the fibers
crosslink mobile receptors, producing capping or clustering
of adsorbed virions. This capping process is dependent
on energy metabolism by the adsorbed cell. Adsorbed L3
virions can also crosslink receptors on different cells,
promoting cell fusion to produce the giant cells seen in
L3-infected cultures.

The rate of total DNA synthesis in L3-infected cells
decreases post-adsorption, but continues at a measurable 
rate for at least 17 hours. By 1 hour post-adsorption, about 
40% of nascent DNA is phage DNA; by 2 hours, about 
60-80% of nascent DNA is phage DNA; and by 3 hours 
essentially all nascent DNA is phage DNA. L3 must not 
obtain precursors for progeny DNA synthesis by re-utilizing 
host cell DNA nucleotides because: (i) although the host cell
chromosome is progressively unfolded and possibly fragmented
during L3 infection, no significant amount of acid-soluble
cell DNA is released; (ii) no significant amount of
cell DNA is recovered in L3 progeny; and (iii) if cell DNA
nucleotides were the only L3 DNA precursors, then only
about 100 progeny L3 could be produced, although many
L3-infected cells produce more than 500 progeny phages.
The small genome of L3 phage argues against it encoding
gene products for de novo nucleotide synthesis, like
those found in large genome phages. Instead, L3 DNA
precursors must come from the salvage pathways used by
mycoplasmas for nucleotide synthesis, in which medium-supplied
free bases and nucleosides are converted to nucleotides.
Salvage-pathway enzymes may also be encoded by
L3 phage.

Intracellular L3 DNA replication involves fast-sedimenting
complexes, indicating formation of concatemers.
These may be formed by recombination between the
terminally redundant ends of linear L3 DNA molecules. As
for DNA synthesis, the rates of total RNA and protein
synthesis in L3-infected cells decrease post-adsorption, but
both continue at measurable rates for at least 17 hours.
Approximately 20 phage-specific proteins (including 10
virion proteins) have been identified in L3-infected cells by
SDS-PAGE. However, the long labeling times needed for
these studies have precluded studies of the temporal regulation
of L3 gene expression. L3 DNA packaging is probably
from concatemers via a pac-site cutting mechanism
that generates linear progeny phage DNAs with terminal
redundancy and limited circular permutation.

Growth of L3-infected cells stops at the start of infection.
One-step growth curves of L3-infected A. laidlawii
show a 90 minute latent period followed by a linear increase
in progeny phage that continues for about 15 hours. This
also was shown by “single burst” experiments in which
the average phage yield per infected cell was found to
increase with time post-adsorption, and by measurements
of phage production by individual cells as a function
of time post-adsorption. The latter studies found that every
L3-infected cell that begins to release progeny phage during
the first few hours post-adsorption continues to do so for
at least 24 hours post-adsorption. Artificial lysis experiments
show a 60 minute eclipse period followed by intracellular
accumulation of infectious progeny phage. Since L3
produces clear plaques, these data suggest that Acholeplasma
phage L3 produces a cytocidal infection with progeny phage
release by a nonlytic mechanism.


By 8 hours post-adsorption, the cytoplasmic face of 
the host membrane is covered by L3 virions, oriented 
radially with tails facing the membrane. Extracellular 
membrane vesicles enclosing one or more L3 progeny viri¬ 
ons are seen in L3-infected cell preparations. It is not 
known whether these vesicles are a budding mechanism 
for progeny L3 virion release or whether there is some 
other means for the release of non-enveloped L3 virions. 
L3 phage-containing vesicles somehow must bud from 
infected cells without destroying cell integrity, since infected 
cells continue to release progeny phage for many hours. 
The vesicles may eventually break down to release non- 
enveloped L3 virions. 

With the exception noted below, L3 is the only Achole¬ 
plasma phage reported, thus far, that is able to propagate on 
an Acholeplasma species other than A. laidlawii. L3 forms 
plaques on lawns of A. laidlawii, A. modicum, and A. oculi. 
The exception noted above is recent studies which indi¬ 
cate that Acholeplasma phage L2 can be propagated on 
A. laidlawii and A. oculi strain ISM1499 (K. Dybvig, unpub¬ 
lished data). 

Spiroplasma Phage SpV3 

Virion and Macromolecules 

SpV3 has been isolated from S. citri and S. mirum strains. 
Virions have a polyhedral head, about 40 nm in diameter, 
and a short tail, 13-8 nm by 6-8 nm. As regards virus 
taxonomy, Spiroplasma phage SpV3 is an unassigned species 
in the Family Podoviridae, where it is listed as phage C3 (an 
earlier designation) (47). SpV3 virions have been reported 
to have a density of 1.45 g/cm³ in CsCl. However, since it 
has also been reported that banding in CsCl causes partial 
SpV3 dissociation and DNA release, the density data must 
be questioned. The SpV3 genome is linear dsDNA, 21 kbp 
in size with about 5% terminal redundancy and circular 
permutation. Preliminary data indicate the SpV3 G+C 
content is 27%, experimentally the same as the 26% G+C 
content of S. citri DNA. Different numbers and sizes of 
proteins have been reported for SpV3. It is not known 
whether this is due to differences between different isolates 
or experimental differences between different laboratories 
studying the same phage. 

Growth and Replication 

Electron microscopic studies show accumulations of 
intracellular progeny virions and membrane-bounded 
budding phages. This suggests that release of progeny 
phage from SpV3-infected cells involves budding similar 
to the type of release described above for Acholeplasma 
phage L3. No additional data are available on SpV3 
replication. 

Spiroplasma Phage ai 

Virion and Macromolecules 

Phage ai has been isolated from S. citri and has a morpho¬ 
logy similar to that of phage SpV3: polyhedral heads, 
43-54 nm in diameter, and 14 nm long tails. Phage ai as 
well as SRO phages (discussed below) have not been formally 
classified but presumably are members of family Podoviridae. 
Phage ai virions have a density of 1.45 g/cm³ in CsCl and 
are resistant to treatment with the nonionic detergent 
Triton X-100. The phage ai genome is linear DNA of 16 kbp 
with cohesive ends. Linear, circular, and concatemeric 
ai DNA has been identified in ai-infected cells. Heating and 
S1 nuclease treatments convert the circular and concate¬ 
meric DNA to linear 16 kbp DNA. The virions of ai and 
two ai-related phages have seven proteins of sizes 86, 65, 63, 
55, 47, 45, and 26 kDa. 

Growth and Replication 

Phage ai seems to have a nonlytic, cytocidal productive 
infection cycle similar to that described above for Achole- 
plasma phage L3 but, unlike L3, phage ai can also have a 
lysogenic cycle. One-step growth and artificial lysis experi¬ 
ments show the ai productive infection cycle has an 
eclipse period of about 3 hours, after which time intra¬ 
cellular accumulation of progeny virions begins. The 
ai latent period is 5-6 hours, after which time progeny- 
phage release begins. Progeny phage release continues 
until about 11 hours post-infection, yielding 40 progeny 
ai per infected host. 

All S. citri strains that have been examined contain 
cryptic ai prophage consisting of less than full-length ai 
DNA fragments integrated in the cell chromosome. Some 
ai-infected cells produce ai-lysogens with an ai genome 
integrated in the cell chromosome within a cryptic ai 
prophage, presumably by homologous recombination. Such 
lysogens are immune to superinfection by homologous 
phages but can be infected by heterologous phages. 
Although some ai phage is spontaneously released from 
lysogens, the lysogenic phenotype is retained after repeated 
lysogen passage in media containing antiserum to ai 
phage. Hence, the lysogenic phenotype appears not to be 
due to some type of persistent infection. Lysogens do not 
adsorb ai phage. This indicates that immunity to super¬ 
infection in ai lysogens is similar to that for Salmonella 
phage lysogeny, where immunity involves modification of 
cell-surface phage receptors. 

Spiroplasma SRO Phages 

SRO (sex ratio organisms) are spiroplasmas that infect 
some Drosophila and affect the sex ratio of their progeny. 
These SRO spiroplasmas are infected by short-tailed 
phages. Unfortunately, the few studies of SRO phages were 
done before a medium for growing SRO was developed, so 
the data are limited to preparations isolated from SRO- 
infected Drosophila (23). 

Virion and Macromolecules 

Several SRO phage strains have been studied. All have 
a short-tailed phage morphology with a polyhedral head, 
35-45 nm in diameter, a short tail, 10-12 nm by 7-9 nm, 
and a baseplate. These dimensions are similar to those 
of Spiroplasma phages SpV3 and ai. SRO phage prepara¬ 
tions have been reported to contain linear dsDNAs of 17, 
21.8, and >30 kbp. In some cases, more than one of these 
DNAs have been found in the same SRO phage prepara¬ 
tion. Restriction endonuclease cleavage and hybridization 
studies of these DNAs have produced confusing results, as 
expected for studies of phages that have not been cloned or 
purified. 

Growth and Replication 

As with SpV3, electron microscopic studies of phage-infected 
SRO cells show accumulations of intracellular progeny 
virions and membrane-bounded budding phages, suggest¬ 
ing that progeny SRO phages are released from infected 
cells by budding. 

Injection of SRO spiroplasmas from one Drosophila 
species into another SRO spiroplasma-infected Drosophila 
species can produce interference with the sex ratio trait, in 
some cases leading to loss of the trait and return to a 
normal progeny sex ratio in the recipient Drosophila or in 
their daughters. This loss of pathogenicity appears to 
be caused by clumping of the two SRO spiroplasmas and 
eventual lysis of the SRO in the recipient Drosophila. Elimina¬ 
tion of the sex ratio trait has been shown to be phage- 
mediated. Preliminary studies have shown that only SRO 
phage-infected spiroplasma cells produce this effect and 
that the SRO phage determines the clumping specificity 
of its host cells. SRO phage-infected spiroplasmas only 
clump with spiroplasmas from a different Drosophila species. 
This phenotype change in SRO phage-infected spiro¬ 
plasmas may be similar to the lysogenic conversion and 
modification of the host cell surface described above for 
Spiroplasma phage ai. 

Mycoplasma Phage P1 

Virion and Macromolecules 

P1 virions have a short tail and an isometric head only 
28 nm in diameter (11). As regards virus taxonomy, Myco¬ 
plasma phage P1 is presumably a member of the family 
Podoviridae (47), although it has not been formally classified. 
P1 infectivity is resistant to treatment with the nonionic 
detergent Triton X-100, chloroform, DNase I, and RNase A. 
Infectivity is lost by treatment with sodium dodecyl sulfate, 
deoxycholate, and proteinase K (11). 

P1 virions contain a linear, dsDNA genome of 11,660 bp 
with inverted terminal repeats of 350 bp (46, 52). The P1 
DNA G+C content is 26.8%, experimentally indistin¬ 
guishable from its host’s 26.6% G+C content (6). From the 
complete nucleotide sequence, 11 putative P1 genes were 
identified (46). These are organized such that transcrip¬ 
tion can be initiated within each inverted terminal repeat 
and proceed toward the middle of the genome, with six 
genes being transcribed from one terminus and five from 
the other. Consistent with this gene organization, puta¬ 
tive promoters oriented inward were identified within the 
inverted terminal repeats and a transcription terminator 
was identified in the middle of the genome where the two 
opposing RNA polymerase molecules would meet. 

Considerable evidence indicates a terminal protein is 
attached to the 5'-terminus of each P1 DNA strand (52). 
Initial attempts to identify P1 DNA by agarose gel electro¬ 
phoresis of nucleic acids extracted from P1 virions were 
unsuccessful. Pretreatment with proteinase K resulted in 
P1 DNA that migrated in agarose gels as linear molecules. 
P1 DNA was resistant to digestion with phage λ exo¬ 
nuclease, an enzyme that specifically degrades from DNA 
5'-termini. Electron microscopic analysis showed the 
presence of globular material at the ends of the P1 genome. 
The globular material, presumably protein, was absent on 
P1 DNA that had been treated with proteinase K. P1 genome 
termini also have a low level of nucleotide sequence similar¬ 
ity to the termini of the genome of phage φ29, a well- 
characterized phage that has terminal proteins attached 
to the ends of its linear dsDNA genome (36) (see chapter 22). 

P1 virion proteins have not been studied. Tentative 
assignments have been suggested for several proteins 
predicted from the P1 genome sequence (46). The P1 ORF8 
gene product has a collagen-like repetitive motif similar to 
that of some bacteriophage tail fiber proteins (44). P1 ORF1 
is predicted to encode a phage DNA replication protein. 
The putative product of P1 ORF3 has a low level of amino 
acid sequence similarity to the terminal protein of φ29 and 
related phages, but experimental verification is needed 
to conclude that P1 ORF3 actually encodes a terminal 
protein. The other ORFs in the P1 genome encode proteins 
with no significant sequence similarity to proteins in 
sequence databases. 

Growth and Replication 

The pseudo-first-order rate constant for P1 adsorption is 
very low, 2 × 10⁻¹¹ cm³/min at 37°C. P1 infection is cytocidal. 
Host variants to which P1 fails to adsorb can be isolated 
at a high frequency (10). M. pulmonis cells produce a family 
of phase-variable lipoproteins referred to as V-1 antigens 
or Vsa proteins (3, 39). Each cell can produce only one Vsa 
protein because the genome has a single vsa expression 
site that promotes transcription and translation of the gene 
that occupies this site (3, 6, 39). Alternative vsa genes can 
become expressed as a result of site-specific DNA inver¬ 
sions that bring a gene into the expression site while 
moving the previously expressed gene to a silent site. P1 
virus adsorbs to cells that produce the VsaA protein. Cells 
that produce VsaB are resistant to infection due to failure of 
the virus to adsorb (10). Cells that produce VsaC and VsaE 
proteins are also resistant to P1 infection, presumably due 
to lack of P1 adsorption (19). Because P1 virus can infect 
only cells that produce VsaA, VsaA is a candidate for 
being the virus receptor. All Vsa proteins have an identical 
N-terminal domain of 242 amino acids. The unique feature 
of the VsaA protein is a 17 amino acid sequence that is 
repeated in tandem about 40 times (3). This repetitive 
sequence may interact with P1 virions, possibly through 
the collagen-like repeat of the P1 ORF8 gene product. 
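
To give a feel for what so low an adsorption rate constant implies, the following sketch applies the standard pseudo-first-order adsorption relation, in which the fraction of phage still unadsorbed after time t is exp(-kNt). Only the rate constant is taken from the text; the host density of 10^8 cells/ml and the time points are assumptions chosen for illustration, not values from the studies cited.

```python
import math

K_P1 = 2e-11          # cm^3/min at 37 C, adsorption rate constant quoted in the text
HOST_DENSITY = 1e8    # cells/ml (1 ml = 1 cm^3); assumed for illustration only


def fraction_adsorbed(k, cells_per_ml, minutes):
    """Fraction of phage adsorbed under pseudo-first-order kinetics.

    Assumes dP/dt = -k * N * P with the cell density N held constant.
    """
    return 1.0 - math.exp(-k * cells_per_ml * minutes)


if __name__ == "__main__":
    for minutes in (10, 30, 60):
        frac = fraction_adsorbed(K_P1, HOST_DENSITY, minutes)
        print(f"{minutes:3d} min: {frac:.1%} of phage adsorbed")
```

Under these assumed conditions only about 6% of the input phage would be adsorbed after 30 minutes, which is consistent with the description of the rate constant as very low.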

P1 was originally observed as phage-like particles in elec¬ 
tron micrographs of M. pulmonis strain 5884 (11). Several, 
but not all, strains of M. pulmonis can serve as phage 
indicators, with strain UAB 6510 and its derivatives being 
used for P1 isolation and characterization (11). Plaques 
are unusually heterogeneous in size and turbidity, an obser¬ 
vation that can be explained by high-frequency changes 
in the host. Variations in M. pulmonis Vsa surface proteins 
affect the ability of P1 to adsorb to host cells (12, 19), and 
variations in the production of phase-variable restriction- 
modification systems affect the ability of progeny phage 
to successfully infect neighboring cells in a developing 
plaque (12). Thus, the dynamics of interactions between P1 
phage and host cells are complex. One-step growth curves 
have a latent period of 1.5 hours and a rise period over 
4 hours (11). By 6 hours post-infection, 100-500 progeny 
phage have been released per infectious center. 

The terminal protein probably serves as the primer for 
P1 DNA replication as it does for other phages with termi¬ 
nal proteins (35). However, intracellular replication, assem¬ 
bly, and release of P1 virions have not been studied. 
Numerous virus-like particles that appeared to be intra¬ 
cellular and in close proximity to the cytoplasmic mem¬ 
brane were noted in the initial electron micrographs of 
M. pulmonis strain 5884 (R. Miles, personal communica¬ 
tion). Therefore, in contrast to the short-tailed phages of 
Acholeplasma (23, 24), P1 virions may be released from 
infected cells in a single burst as opposed to budding in a 
non-lytic manner. 

dsDNA Quasi-spherical Phages 

One of Gourlay’s original Acholeplasma phage isolates was 
an enveloped quasi-spherical virion containing dsDNA 
and infecting some A. laidlawii strains. This isolate was 
designated Acholeplasma phage L2 and, although other 
apparently similar isolates have been reported, only L2 has 
been studied in detail (23, 24, 28). 

Acholeplasma Phage L2 

Virion and Macromolecules 

Acholeplasma phage L2 has a unique morphology: virions 
are enveloped, quasi-spherical particles, 50-125 nm in 
diameter. The broad virion size range is due to heterogeneity 
of L2 preparations. Infection by an L2 virion results in 
the production of three morphological forms of progeny 
phages, designated L2-I, L2-II, and L2-III. These are pro¬ 
duced in about the same relative amounts — the ratio of 
L2-I to L2-II to L2-III is 4-16 to 2-3 to 1 — regardless of 
which form is used to start an infection. The virion 
diameters of L2-I, L2-II and L2-III are 74, 88, and 132 nm, 
respectively, based on measurements in aqueous media 
that yield slightly larger sizes than electron microscopic 
measurements. 

All three L2 forms contain unit-length DNA molecules, 
so the larger forms are not the result of concatemer packag¬ 
ing. However, UV inactivation studies indicate that, while 
L2-I and L2-III contain one genome copy per virion, L2-II 
contains two to three genome copies per virion. Electron 
micrographs show virions have a quasi-spherical, densely 
stained core bounded by a membrane. These data and 
the absence of any defined capsid structure have led to the 
proposal that L2 virions are nucleoprotein condensa¬ 
tions within a lipid-protein membrane. The lipid composi¬ 
tion of L2 virions and that of the A. laidlawii cell membranes have been 
studied in several laboratories, but there is no consensus 
on the relative amounts of the major lipid classes (glyco- 
lipids, phospholipids, phosphoglycolipids, and neutral 
lipids). L2 virion and host cell membranes have been shown 
to have essentially identical fatty acid compositions. 

As regards virus taxonomy, Acholeplasma phage L2 is 
the type species (and only classified member) of the Genus 
Plasmavirus in the family Plasmaviridae (enveloped quasi- 
spherical phages containing dsDNA) (47). Efforts to measure 
L2 density have been confounded by dehydration of virions 
within the sucrose gradients used for the experiments, and 
a true equilibrium density has yet to be determined. L2 
biological activity is sensitive to treatment with nonionic 
detergents Brij-58, Triton X-100, and Nonidet P-40; sensi¬ 
tive to ether; and sensitive to chloroform. Virions are also 
sensitive to treatment with pronase and trypsin, but not to 
DNase I or to phospholipase A2. L2 virions are extremely 
heat-sensitive, but relatively cold-stable. 

L2 virions contain circular dsDNA with a genome size 
of 11,965 bp and 32.0% G+C content, which is experimen¬ 
tally indistinguishable from the A. laidlawii 31.8% G+C (28). 
Sequence analysis has identified 15 putative ORFs, all 
on one DNA strand, designated ORF1 to ORF14 and 
ORF13*. The ORFs are clustered in four groups separated by 
noncoding intergenic regions. Sequence analysis of the L2 
genome indicates that transcription uses promoters and 
rho-independent termination sites, similar to those in 
Gram-positive bacteria, to produce polycistronic mRNAs 
of gene clusters. Several possible cases of transcriptional 
read-through have also been identified. Each of the 15 L2 
ORFs has an upstream Shine-Dalgarno sequence and starts 
with an ATG codon. Translational coupling or reinitia¬ 
tion may be involved in translation of the polycistronic 
mRNAs. 

The gene product of ORF5, the only L2 ORF with signifi¬ 
cant sequence similarity to any database sequence, is the 
putative L2 integrase based on its sequence similarity to 
site-specific DNA recombinases. ORF13 and ORF14 contain 
N-terminal signal sequences of 27 and 26 amino acids, 
respectively, indicating that their putative gene products 
may be integral membrane proteins. ORF13* has its start 
codon within and in the same reading frame as ORF13 and, 
therefore, the ORF13* gene product must be the 443 amino 
acid C-terminal fragment of the 738 amino acid ORF13 
gene product. The putative ORF12 gene product is a 17 kDa 
basic protein, which may be the major virion 19 kDa pro¬ 
tein identified by SDS-PAGE. In agreement with the sugges¬ 
tion that the 19 kDa protein is a DNA-binding protein, the 
ORF12 putative gene product is the only L2 gene pro¬ 
duct with extensive helix-turn-helix motifs, characteristic 
of DNA-binding proteins. 

Analysis of purified L2 virions by SDS-PAGE shows 
four proteins: 64, 61, 58, and 19 kDa. The 64 kDa protein 
appears to be an L2 integral membrane protein based on 
its reaction with a variety of reagents and may correspond 
to the ORF13 gene product, which would be a 78 kDa pro¬ 
tein after cleavage of its signal sequence. The 61 kDa protein 
appears to be a peripheral membrane protein, perhaps corre¬ 
sponding to the ORF1 gene product, which would be a 
67 kDa protein. The 19 kDa protein is the major L2 protein 
and does not appear to be a membrane protein. It has 
been proposed to be a DNA-binding protein involved in 
nucleoprotein core assembly and may correspond to the 
ORF12 gene product, which would be a 17 kDa protein 
with extensive helix-turn-helix motifs. 
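
The predicted sizes quoted above can be cross-checked with simple arithmetic. The sketch below assumes the common rule of thumb of roughly 110 Da per amino acid residue, which is not taken from the text and is only an approximation; the residue counts and signal-sequence length are those given for ORF13 and ORF13*.

```python
AVG_RESIDUE_KDA = 0.110   # rule-of-thumb average residue mass; an assumption

ORF13_LENGTH = 738        # amino acids, from the text
ORF13_SIGNAL = 27         # N-terminal signal sequence length, from the text
ORF13_STAR_LENGTH = 443   # C-terminal fragment length, from the text


def approx_mass_kda(n_residues, signal_residues=0):
    """Rough protein mass in kDa after removing an N-terminal signal sequence."""
    return (n_residues - signal_residues) * AVG_RESIDUE_KDA


# Mature ORF13 product after signal cleavage (711 residues).
orf13_mature_kda = approx_mass_kda(ORF13_LENGTH, ORF13_SIGNAL)
# ORF13* is the C-terminal 443 residues, so it begins at this ORF13 residue.
orf13_star_first_residue = ORF13_LENGTH - ORF13_STAR_LENGTH + 1
orf13_star_kda = approx_mass_kda(ORF13_STAR_LENGTH)

if __name__ == "__main__":
    print(f"ORF13 mature product: ~{orf13_mature_kda:.0f} kDa")
    print(f"ORF13* starts at ORF13 residue {orf13_star_first_residue}")
    print(f"ORF13* product: ~{orf13_star_kda:.0f} kDa")
```

With this approximation the processed ORF13 product comes out near 78 kDa, in line with the figure quoted, and the ORF13* start codon falls at residue 296 of ORF13.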

Growth and Replication 

L2 adsorption to A. laidlawii cells follows pseudo-first- 
order kinetics with an adsorption rate constant, depending 
on the medium being used, that is 10-100 times less than 
the theoretical value calculated for single-hit kinetics. 
Therefore, depending on the medium, only 0.1-10% of 
phage-cell collisions result in L2 adsorption. This may be 
due to a limiting number of L2 receptors or to a rate- 
limiting step in the interaction of virion and cell-membranes. 
Studies of the molecular nature of L2 receptors have 
produced conflicting data, so there are no conclusions 
on which types of cell-membrane macromolecules are 
involved in L2 adsorption. It is assumed that adsorption 
leads to fusion of L2 phage and host cell membranes, result¬ 
ing in entry of the virion nucleoprotein condensation into 
the cell and then its uncoating. Note that recent studies indi¬ 
cate that Acholeplasma phage L2 can also be propagated on 
A. oculi strain ISM1499 (K. Dybvig, unpublished data) 
suggesting a co-occurrence of the phage L2 cell-surface 
receptor in A. laidlawii, phage L2’s normal host, and the 
A. oculi strain. 

L2 one-step growth curves have a 1-2 hour latent 
period followed by a gradual rise period that levels off at 
6-10 hours post-infection. Artificial lysis experiments 
show that the latent and eclipse periods are the same, 
indicating that L2 assembly and release are coupled 
and that L2 infection is noncytocidal. By 10 hours post¬ 
infection, 100-1000 progeny phage have been released per 
infected cell, depending on the host cell strain being used. 
Since there is no measurable loss in cell titer in L2-infected 
cultures at an MOI up to 30 and since maturation by budd¬ 
ing from infected host cell membranes is shown by electron 
microscopy (hence the similarity of virion and cell fatty 
acid composition reported above), most if not all infected 
cells must be noncytocidally infected. 

Pulse-labeling and restriction endonuclease fragment 
analysis have shown that L2 DNA replicates bidirection¬ 
ally from two ori sites. These sites have been mapped and 
sequence analysis has located each ori site in an intergenic 
region of the L2 genome. Each putative ori site contains 
a DnaA box bounded by AT-rich 6-mer repeats. Intracel¬ 
lular L2 DNA replication is membrane-associated and 
appears to involve a host DNA polymerase similar to Gram- 
positive bacterial DNA polymerase III. Continuous- and 
pulse-labeling studies also show that L2 DNA replication 
in infected cells continues throughout the growth-curve 
rise period. L2 DNA replication and progeny virion 
maturation continue for several hours after integration of a 
phage genome into the host cell genome. Although L2 
progeny DNA replication stops about 5-6 hours post¬ 
infection, cytoplasmic progeny L2 DNA persists up to at 
least 10 hours post-infection. 

Lysogeny 

Phage L2 has a unique type of infection cycle, with pro¬ 
ductive noncytocidal replication (above) that is followed 
by lysogeny in most if not all L2-infected cells. About 2-4 
hours post-infection, L2 infection of A. laidlawii leads to 
establishment of lysogeny with the circular L2 genome 
integrated at a unique site in the host cell chromosome. 
Based on extensive similarity to the attP site of phage λ, 
the L2 attP integration site was mapped to 280 bp down¬ 
stream from the putative L2 integrase gene (ORF5), within 
a 600 bp intergenic region. This is similar to the situation 
in other temperate phages with lysogeny involving site- 
specific integrases, in which attP sites are downstream 
of integrase genes (see chapter 7 for a review of phage inte¬ 
gration). The L2 attP site is a 25 bp sequence with a 9 bp 
inverted repeat: CATCTTCAT-7 bp-CTGAAGATA. 
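
The structure of the attP site as printed can be checked directly. The sketch below takes the two 9 bp arms and the 7 bp spacer length from the sequence given above, computes the reverse complement of the left arm, and counts how many positions agree with the right arm. The variable names are illustrative, and the comparison says nothing about which positions matter for integrase recognition.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


LEFT_ARM = "CATCTTCAT"    # left 9 bp arm, as printed in the text
RIGHT_ARM = "CTGAAGATA"   # right 9 bp arm, as printed in the text
SPACER_LENGTH = 7         # bp between the arms, as printed in the text

site_length = len(LEFT_ARM) + SPACER_LENGTH + len(RIGHT_ARM)
perfect_partner = reverse_complement(LEFT_ARM)
matching_positions = sum(a == b for a, b in zip(perfect_partner, RIGHT_ARM))

if __name__ == "__main__":
    print(f"site length: {site_length} bp")
    print(f"perfect inverted partner of left arm: {perfect_partner}")
    print(f"positions matching the printed right arm: "
          f"{matching_positions} of {len(RIGHT_ARM)}")
```

The arms and spacer add up to the stated 25 bp. As printed, the right arm agrees with the exact reverse complement of the left arm (ATGAAGATG) at 7 of 9 positions, differing only at the two outermost bases, so the repeat as given here is imperfect at its ends.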

L2 lysogens are immune to superinfection by homolo¬ 
gous, but not by heterologous, phages. Two models of 
immunity exist, both examples of lysogenic conversion: 
(i) expression of an L2 repressor which inhibits intracel¬ 
lular L2 gene expression (and thereby productive infec¬ 
tion) and (ii) modification of the cell surface, thereby 
affecting phage adsorption. Since L2 lysogens also are 
immune to superinfection by transfecting L2 DNA, immu¬ 
nity seems at least due to an L2 repressor. By contrast, 
phage ai lysogens appear to be immune at least due to 
modification of cell surface receptors. Consistent with this 
repression model of L2 immunity, lysogens can be induced 
to produce L2 phages by mitomycin C and by UV treatment 
(see chapter 8). Though not necessarily leading to cell- 
surface-mediated immunity, during productive infection 
the osmotic stability of L2-infected cells increases, which 
is perhaps due to an increase in cell permeability during 
the period of L2 membrane synthesis, maturation, and 
budding. After the end of productive L2 infection and estab¬ 
lishment of lysogeny, osmotic fragility returns to that of 
uninfected cells. 

L2 has a noncytocidal productive infection cycle 
which is followed by a lysogenic cycle in most if not all L2- 
infected cells. The regulatory mechanism for a decision 
between productive and lysogenic cycles therefore must 
be different for L2 and the lambdoid phages. For the latter, 
the decision is between a lytic and a lysogenic cycle, which 
are mutually exclusive. Similarly, phage ai has a cytocidal 
productive infection cycle (above), so regulation of lyso¬ 
geny requires a choice between either productive infection 
or lysogeny. The L2 regulatory mechanism cannot be a 
simple temporal switch from a productive infection cycle to 
a lysogenic one because L2 integration takes place during 
the rise period of the productive infection cycle, which 
continues for several hours after prophage formation. The 
nature of the L2 regulatory mechanism is not known. 


Insertion Mutants and Miniphages 

Two spontaneous L2 insertion mutants have been iso¬ 
lated from wild-type L2 stocks. Each has a genome of about 
15 kbp, consisting of the 12.0 kbp L2 genome plus a 3.3 kbp 
insertion. The 3.3 kbp fragment is the result of transposi¬ 
tion of two noncontiguous regions of the L2 genome, and 
the two L2 insertion mutants differ only in the location of 
the 3.3 kbp insert. During serial passage of each L2 insertion 
mutant, miniphage DNAs are formed. These are superheli¬ 
cal circular molecules of the 3.3 kbp insertion DNA and 
concatemers of the 3.3 kbp DNA. These miniphage DNAs 
can be packaged during L2 infection and produce particles 
that are defective and, perhaps, interfering. 

Unclassified Phages 

There are a number of reports of uncharacterized myco¬ 
plasma phages. Most of these have been reviewed previously 
(23, 24) and will not be discussed in detail here because 
no new information is available. Examples include phage 
M1 of Acholeplasma modicum (8), phage O1 of A. oculi (30), 
phage Br1 of Mycoplasma bovirhinis (17), and phage Hr1 of 
Mycoplasma hyorhinis (18). Many of the unclassified myco¬ 
plasma phages are thought to have DNA genomes based 
on their tailed-phage morphology, but in most cases no 
data are available on the nucleic acid content. In addition to 
these unclassified viruses, there are numerous reports 
of virus-like particles in electron micrographs of myco¬ 
plasma cultures (27). Also, there are anecdotal stories of 
extrachromosomal elements and a few reports of sponta¬ 
neous plaques on lawns of some mycoplasma species. 
These observations suggest that mycoplasma phages may 
be more common than the sporadic reports of their isolation 
might indicate. Of the unclassified mycoplasma phages, 
significant data are only available for Mycoplasma phage 
MAV1 and the Acholeplasma phage L172. 

Mycoplasma Phage MAV1 

History and Importance 

Phage MAV1 infects M. arthritidis, a mycoplasma that 
causes chronic arthritis in murine animals (31, 40). Initially, 
a 16 kbp extrachromosomal dsDNA element was identified 
in M. arthritidis (strains 158p10, 14124, and 14152) that was 
suspected of being of phage origin because of its linear 
restriction map (49). Subsequently, Mycoplasma phage 
MAV1 was plaque-isolated from one of these M. arthritidis 
strains (158p10) using M. arthritidis strain 158 as the indica¬ 
tor strain. As a lysogenic phage, MAV1 is uniquely important 
for studies of Mycoplasma pathogenicity because MAV1 
lysogens are more virulent than nonlysogens (45, 51) (see 
chapter 47 for a general overview of the phage impact on 
bacterial pathogenicity). 

Virion and Macromolecules 

The morphology of MAV1 is not known and the taxonomy 
of Mycoplasma phage MAV1 has therefore not been deter¬ 
mined. MAV1 is resistant to treatment with DNase I, RNase 
A, chloroform, the nonionic detergent Triton X-100, trypsin, 
and proteinase K (49). Although many phages are relati¬ 
vely resistant to proteases, the resistance of a phage to a 
general protease, such as proteinase K, is unusual. However, 
Spiroplasma phage SpV4 (reviewed above) is also reported 
to be proteinase K resistant (33). Protease resistance may 
have evolved to adapt these phages to their host cells. 
Mycoplasmas have small genomes and lack many biosyn¬ 
thetic pathways, particularly for amino acids, purines, and 
pyrimidines. Hence, they must acquire these compounds 
from the environment by the production and secretion 
of scavenging proteases and nucleases. We postulate, 
therefore, that MAV1 has evolved resistance to a general 
protease secreted by M. arthritidis that has not been 
identified yet. 

MAV1 has been found to have a linear dsDNA genome 
of 15,644 bp with 29.0% G+C content (50). The degrada¬ 
tion of MAV1 DNA by λ exonuclease indicates that the 
5'-terminus of the DNA strands is exposed and not attached 
to terminal protein (49). With one exception, all MAV1 
ORFs are on the same DNA strand (designated the 
(+) strand). Amino acid sequences of most of the predicted 
MAV1 proteins have no significant matches with proteins 
in sequence databases. Exceptions are the MAV1 ORFs 
that are predicted to encode a DNA replicase, an integrase, 
and a cytosine-specific restriction-modification system. 
Another ORF (the imm gene) would encode a protein with 
motifs characteristic of phage repressor proteins and may 
be the immunity protein that renders MAV1 lysogens 
resistant to superinfection. The lone ORF on the (—) strand 
is predicted to encode a lipoprotein designated Vir. 

An interesting aspect of the MAV1 genome sequence, 
like that of Acholeplasma phage L3 (above), is the complete 
absence of the sequence GATC, which would indicate 
that phage MAV1 has evolved to avoid GATC-specific 
restriction-modification systems. However, cell chromoso¬ 
mal DNA of the strains of M. arthritidis that have been exam¬ 
ined is readily degraded by the GATC-specific restriction 
enzymes, MboI and Sau3AI (49). It has been suggested 
that the MAV1 hosts that have been studied lack GATC- 
specific restriction systems but that other hosts that have 
not yet been identified may have such systems. Alternatively, 
rather than an unknown host having a GATC-specific 
restriction system, the cytosine-specific restriction system 
which is predicted to be encoded by the MAV1 genome may 
recognize GATC. MAV1 lysogens have unmodified GATC 
sequences that are cleaved by MboI and Sau3AI, but this 
would be expected because a hypothetical phage-encoded 
GATC-specific restriction system would most likely be 
expressed only during productive infection and not in lyso¬ 
gens. In support of this latter possibility, reverse trans¬ 
cription polymerase chain reaction (RT-PCR) analysis of 
total RNA isolated from MAV1 lysogens detected mRNA 
transcripts corresponding to the imm and vir genes but 
not to other MAV1 genes, such as those encoding the 
restriction system (50). 
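
How striking the absence of GATC is can be gauged with a rough estimate of how many sites a random sequence of the same size and base composition would contain. The sketch below uses the genome length and G+C content given above; it assumes independent bases with G = C and A = T, and a Poisson approximation for the chance of finding no site, both of which are simplifications rather than properties reported for MAV1.

```python
import math

GENOME_LENGTH = 15644   # bp, MAV1 genome size from the text
GC_FRACTION = 0.29      # MAV1 G+C content from the text


def expected_site_count(site, genome_length, gc_fraction):
    """Expected occurrences of `site` on one strand of a random sequence.

    Assumes independent bases with G = C = gc/2 and A = T = (1 - gc)/2.
    """
    freq = {
        "G": gc_fraction / 2, "C": gc_fraction / 2,
        "A": (1 - gc_fraction) / 2, "T": (1 - gc_fraction) / 2,
    }
    prob = math.prod(freq[base] for base in site)
    return prob * (genome_length - len(site) + 1)


expected_gatc = expected_site_count("GATC", GENOME_LENGTH, GC_FRACTION)
# Poisson approximation for the chance of seeing no site at all by luck.
p_no_site = math.exp(-expected_gatc)

if __name__ == "__main__":
    print(f"expected GATC sites: {expected_gatc:.1f}")
    print(f"P(zero sites) under the random model: {p_no_site:.1e}")
```

Under this simple model about 41 GATC sites would be expected (GATC is palindromic, so each occurrence is a site on both strands), and the chance of finding none is vanishingly small, which supports the view that the absence reflects selection rather than chance.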

Growth and Replication 

MAV1 growth curves have not been done. One of the diffi¬ 
culties with studying MAV1 is that the phage has not 
been reproducibly propagated in broth. When M. arthritidis 
cells that are susceptible to MAV1 are mixed with the phage 
in broth, essentially 100% of the cells become lysogens 
with the production of few phage particles, even when 
the multiplicity of infection is low. In contrast to its poor 
propagation in broth media, MAV1 is readily propagated as plaques 
on lawns in agar. Stocks of the phage can be prepared 
with titers of 10⁸ PFU/ml by scraping top agar overlays 
from lawns of host cells that have undergone confluent 
lysis (49). 


Lysogeny 

Based on Southern analysis using probes specific for MAV1 
DNA, about half of the M. arthritidis strains that have 
been examined are MAV1 lysogens (51). Strains containing 
MAV1 DNA sequences were resistant to infection with 
MAV1. Most but not all of the nonlysogenic strains were 
found to be MAV1 hosts. In some cases, M. arthritidis strains 
that had been maintained in different laboratories differed 
in whether they were MAV1 lysogens. We suspect that 
M. arthritidis strains have on occasion been unknowingly 
infected with MAV1, generating strain differences with 
regard to lysogeny among laboratories. 

During lysogeny, MAV1 DNA inserts into the genome of 
M. arthritidis at the site TATTTTT (49). This 7 bp sequence 
is common in the AT-rich genome of M. arthritidis and 
is present at 42 sites in the MAV1 genome. MAV1 DNA 
can therefore integrate at numerous sites in the host chro¬ 
mosome during the establishment of lysogeny. 
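
As a rough illustration of why a 7 bp target such as TATTTTT offers many candidate integration sites, the following minimal Python sketch scans a sequence for the site on both strands. The function names and the 20-nucleotide test sequence are invented for illustration; they are not MAV1 or M. arthritidis sequence.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an unambiguous DNA sequence."""
    comp = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(comp[base] for base in reversed(seq.upper()))


def find_sites(seq: str, site: str = "TATTTTT") -> list[tuple[int, str]]:
    """List (0-based position, strand) for every occurrence of `site`.

    A '-' hit means the top strand shows the reverse complement of the
    site, i.e. the site itself lies on the bottom strand.
    """
    seq = seq.upper()
    rc = reverse_complement(site)
    hits = []
    for i in range(len(seq) - len(site) + 1):
        window = seq[i:i + len(site)]
        if window == site:
            hits.append((i, "+"))
        elif window == rc:
            hits.append((i, "-"))
    return hits


# Invented toy sequence with one site on each strand.
toy = "GGTATTTTTCCAAAAATAGG"
print(find_sites(toy))
```

Applied to a real genome sequence, the same scan would simply return a longer list of positions; the sketch says nothing about which of those sites are actually used.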

The mechanism by which MAV1 DNA integrates into 
the M. arthritidis chromosome is not known. MAV1 DNA 
probably forms a circular integration intermediate, but 
this has not been proven. PCR experiments using primers 
oriented outward from the ends of the linear MAV1 genome 
yield products consistent with some template molecules 
being circular, but the products also could arise from 
template molecules that are head-to-tail dimers (49). Restric¬ 
tion mapping data show that the termini of linear MAV1 
genomic DNA isolated from virions are the same as the 
termini of MAV1 prophages in lysogens. The ends of the 
MAV1 genome when circularized form the 7 bp TATTTTT 
sequence. Thus, it appears that MAV1 DNA inserts into 
the chromosome by a site-specific recombination mechanism
at the TATTTTT sequence.

Some features of MAV1 DNA integration resemble 
the insertion of transposon Tn916 during transposition. 
Sequences of the extreme ends of MAV1 prophages are not 
identical in all lysogens (49). Single nucleotide differences 
at the MAV1 DNA ends could be interpreted as evidence 
of an integration mechanism involving a staggered nucleotide
cleavage. A circular intermediate formed during Tn916
transposition also has the sequence TATTTTT at the recombination
site, and the sequence at the ends of the transposon
(sometimes referred to as a coupling sequence) varies
as a result of the staggered strand exchange
mechanism (5).


Impact on Host Virulence 

The association between MAV1 and M. arthritidis viru¬ 
lence is clear, but the underlying mechanism requires 
further study. Preliminary data indicate that the myco¬ 
plasma load, assayed as CFU, is greater in animals infected 
with MAV1 lysogens than in animals infected with non- 
lysogens (A.-H. T. Tu and K. Dybvig, unpublished data). 
The difference in mycoplasma load is apparent as little 
as 3 hours post-inoculation. Therefore, MAV1 lysogens 
may resist innate host defenses more effectively than 
nonlysogens. 

Because MAV1 lysogens are always more virulent 
than nonlysogens (51), regardless of the particular TATTTTT 
site into which MAV1 DNA has integrated, it has been 
proposed that MAV1 encodes a virulence factor (50). The 
MAV1 Vir protein is one of only a few examples of lipoproteins
encoded by phage genomes. MAV1 virions are resistant
to treatment with nonionic detergents and chloroform
and, therefore, lack lipid. Thus, it is assumed that Vir is
anchored in the cell membrane of MAV1-lysogenized
M. arthritidis where it may interact with
M. arthritidis-infected organisms. Vir may function similarly to the
lipoprotein of bacteriophage λ, Bor, which is involved in
serum resistance (1).

Acholeplasma Phage L172

History and Importance 

Acholeplasma phage L172 was isolated from washes of 
A. laidlawii lawns and, based on morphological observa¬ 
tions, was believed to be related to Acholeplasma phage L2, 
an enveloped, quasi-spherical phage containing circular 
dsDNA (above). This led to erroneous interpretations of 
subsequent biochemical and biophysical studies "confirming"
L172 to be a circular dsDNA phage. Later investigators,
noting that L172 DNA was sensitive to various
nucleases not expected to hydrolyze dsDNA, showed
that L172 contains circular ssDNA; the original studies
apparently had been done using virion ssDNA contaminated 
with RF dsDNA. 

Virion and Macromolecules 

L172 virions are enveloped quasi-spherical particles, 
60-80 nm in diameter. About 10-12% of L172 dry weight 
is lipid, consisting of phospholipids, glycolipids, and 
phosphoglycolipids. L172 membrane lipid and fatty acid 
compositions are similar to those of the A. laidlawii 
membrane. The virion membrane is 6.5-8.0 nm, but no 
internal structure has been observed. Due to the limited 
data available, the taxonomy of Acholeplasma phage L172 
has not been established, although L172 clearly represents
a new phage family: enveloped, quasi-spherical
phages containing ssDNA (23). L172 is inactivated by the
nonionic detergents octyl glucoside and Triton X-100 and
by ether, and it is also sensitive to heat inactivation.

The L172 genome is 14 kb ssDNA with a G-C content 
of 29.4%, significantly less than the 31.8% G-C content of 
A. laidlawii host cells and the 32.0% G-C content of Acholeplasma
phage L2. L172 DNA is sensitive to S1 nuclease
and resistant to exonuclease VII. L172 virions, analyzed by 
SDS-PAGE, contain seven proteins of 71, 68, 53, 42, 40, 18, 
and 15 kDa. 
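
The G-C percentages quoted above are simple base-composition ratios. A minimal sketch of the calculation follows; the 10-nucleotide string is an invented example, not L172 sequence.

```python
def gc_content(seq: str) -> float:
    """Return the G+C content of a nucleotide sequence as a percentage."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    gc = sum(1 for base in seq if base in "GC")
    return 100.0 * gc / len(seq)


# Invented toy sequence: 2 of 10 bases are G or C.
print(gc_content("ATTATGCAAT"))
```

For a single-stranded genome the value describes the packaged strand only, so it need not equal the composition of the double-stranded replicative form.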

Growth and Replication 

The limited data available indicate that replication of 
L172 ssDNA in A. laidlawii host cells is similar to that of 
L51 ssDNA: L172 RF dsDNA has been isolated from infected 
cells and L172 cannot infect A. laidlawii REP- cells,
which are mutants that cannot be infected by ssDNA Acholeplasma
phages but can be infected by dsDNA Acholeplasma
phages. 

Host Restriction and Modification 

Many restriction-modification systems have been reported 
in mycoplasmas. Previously reviewed restriction systems 
include the CAATTG-specific enzyme of Mycoplasma 
fermentans, a GATC-specific system in A. laidlawii, another 
A. laidlawii system that restricts DNA containing 5-methyl- 
cytosine regardless of sequence context, a GCGC-specific 
system in S. citri, and a GCNGC-specific system in Ureaplasma
urealyticum (13, 26). Some mycoplasmas have potential
restriction-modification systems based on analysis of
complete genome sequences, but phenotypic data are 
lacking (15, 20). 

MarI Restriction System

Many M. arthritidis strains have a restriction system that
recognizes AGCT. This system is designated MarI in the
REBASE database (http://rebase.neb.com/rebase/rebase.html)
(48). MarI is an isoschizomer of the AluI restriction
enzyme, and M. arthritidis genomic DNA is resistant to
cleavage by AluI. Initial attempts to genetically transform
M. arthritidis strains that possess MarI were unsuccessful.
Transformation was achieved only when the transforming
DNA had been modified in vitro with AluI DNA
methyltransferase (MTase) (48). Thus, MarI is a significant
barrier to gene transfer in M. arthritidis. Genomic DNA of
all Mycoplasma phage MAV1 host strains that have been
examined is resistant to AluI digestion, indicating MarI
is widespread in M. arthritidis. Phage MAV1 DNA has
four AGCT sites and would be restricted if it were not modified
by the MarI MTase. However, because the known
MAV1 hosts all possess MarI, all MAV1 phage stocks are
modified at AGCT sites and, therefore, not restricted during
infection.

Phase-Variable Restriction Systems 

M. pulmonis hsd Loci 

M. pulmonis has complex DNA inversion systems that 
encode type I restriction and modification enzymes. Type I
enzymes contain two HsdS subunits that determine the
DNA recognition sequence, HsdM subunits that contain
the catalytic domain for the MTase reaction, and an HsdR
subunit with the catalytic domain for DNA cleavage.
Although both initial binding of the enzyme to duplex
DNA and the MTase reaction are site-specific, the DNA cleavage
reaction occurs at essentially random sites due to the
enzyme's ATP-dependent DNA translocation activity. Each
M. pulmonis hsd locus has two hsdS genes, one hsdR gene, 
and one hsdM gene (14). The two hsdS genes flank hsdR 
and hsdM and are in an inverted orientation to one another. 
DNA inversions occur at high frequency between the two 
hsdS genes (12, 14, 43). The coding region of each hsdS 
gene has two or three recombination sites for DNA inversions.
Thus, the primary amino acid sequences of the
HsdS proteins vary according to the particular DNA inversion
that occurs. These changes in HsdS sequence alter the
specificity of the DNA recognition sequence of the type I
enzyme, as assessed by the plaquing efficiency of Mycoplasma
phage P1 (12). Therefore, DNA inversions between
hsdS sequences generate a family of M. pulmonis restriction 
and modification enzymes with differing specificities. 

The hsd DNA inversions not only change the specificities
of HsdS subunits but also result in phase-variable
expression of restriction and modification activity (12, 14).
One of the hsdS genes is transcribed along with hsdR and
hsdM as an operon (43). A single hsd promoter
at one end of the hsd locus drives transcription of either
the hsdS, hsdR, and hsdM operon or the second hsdS
gene, depending on the orientation of the locus in the chromosome.
Thus, one of the hsdS genes is always transcribed
and expression of hsdR and hsdM is phase-variable.
In the absence of hsdR and hsdM transcription, cells lack 
restriction and modification activity. M. pulmonis strain 
KD735 has two hsd loci: hsd1 and hsd2 (43). If either locus
is oriented such that hsdR and hsdM are transcribed, two
restriction and modification enzymes are produced (12).
One enzyme has the HsdS subunit encoded by the transcribed
hsd1 gene, and the other enzyme's HsdS subunit
is encoded by the transcribed hsd2 gene. Cells that lack
restriction and modification activity only arise when 
both loci are oriented such that hsdR and hsdM are not 
transcribed. 
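
The logic of the two-locus system described above can be summarized as a small truth table. The sketch below is only a bookkeeping aid under the assumptions stated in the text (one hsdS gene per locus is always transcribed; HsdR and HsdM are available whenever at least one locus is in the expressing orientation). The function name is hypothetical.

```python
from itertools import product


def enzymes_produced(hsd1_on: bool, hsd2_on: bool) -> int:
    """Number of distinct type I enzymes expected in strain KD735.

    `hsdN_on` is True when locus N is oriented so that hsdR and hsdM
    are transcribed. HsdS from each locus is assumed to be made in
    either orientation, as described in the text.
    """
    if not (hsd1_on or hsd2_on):
        return 0  # no HsdR/HsdM: no restriction or modification activity
    return 2      # HsdR/HsdM pair with the HsdS subunit from each locus


for hsd1_on, hsd2_on in product([True, False], repeat=2):
    print(hsd1_on, hsd2_on, enzymes_produced(hsd1_on, hsd2_on))
```

Only the state in which both loci are in the non-expressing orientation yields cells without restriction and modification activity.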

The complete genome sequence of M. pulmonis strain 
CT reveals a third hsd locus, hsd3, not identified in strain 
KD735 (6). The hsdR and hsdM genes of hsd3 are most 
likely defective due to frameshift mutations, but the hsdS
genes appear functional. It is possible that three restriction 
and modification enzymes would be produced in the CT 
strain when hsdR and hsdM are transcribed because an 
hsdS gene could be transcribed from each of the three 
hsd loci. 

The M. pulmonis genome sequence
(http://genolist.pasteur.fr/MypuList/) has a gene (MYPU 5310) near the
vsa locus that is predicted to encode a site-specific DNA
recombinase. The vsa locus encodes a family of surface lipoproteins
(V-1 antigens) that vary due to site-specific DNA
inversions within the vsa locus (3, 39, 41). Recently, an
M. pulmonis mutant has been isolated from a transposon 
library in which the recombinase gene is disrupted (42). 
The mutant fails to undergo DNA inversions at both vsa 
and hsd loci. E. coli cells containing the cloned recombinase 
gene, but not E. coli that lack the recombinase gene, can 
undergo site-specific DNA inversions between both vsa 
and hsd sequences (42). Thus, it appears that a single site-specific
DNA recombinase catalyzes both vsa and hsd
inversions, indicating that antigenic variation and restriction
enzyme variation are linked. Such linkage may explain,
at least in part, why a greater percentage of M. pulmonis 
organisms isolated from the respiratory tract of infected 
animals have active restriction systems than do organisms 
that have been maintained in laboratory media (19). 

Other Putative Phase-Variable Enzymes in M. pulmonis

Several putative M. pulmonis genes, or partial genes, are 
predicted to encode type III DNA MTase enzymes. These 
are designated in the genome sequence as MYPU 3950, 
3960, 3970, 3980, and 4800. Other genes that are predicted
to encode cytosine-specific MTase enzymes are
MYPU 0430, 0440, 1850, and 1860. MYPU 4720 and 6880 
are predicted to encode adenine-specific MTase enzymes. 

Many of these genes have tandem dinucleotide repeats 
within the coding region. The number of tandem repeats 
would likely vary as a result of slipped-strand mispairing
during DNA replication (22). MYPU 1850, 3960, 3970,
and 3980 have 9, 8, 12, and 7 tandem AG repeats, respectively;
MYPU 4800 has 13 tandem GA repeats; and MYPU
0440 has 14 tandem CA repeats. For some of these genes, 
the ORF would be extended if the number of tandem repeats 
increases or decreases by one. MYPU 3960 and 3970, for 
example, would be extended 231 and 237 nucleotides, 
respectively, if the number of tandem AG repeats increases 
by one. If the 9 AG repeats in MYPU 1850 increased to 10, 
then the resulting frameshift would merge MYPU 1850 
(predicted gene product of 82 amino acids) and 1860 (pre¬ 
dicted gene product of 264 amino acids) into a single 
ORF encoding a functional MTase of 432 amino acids. Similarly,
the frameshift resulting from a change in the number
of CA repeats in MYPU 0440 from 14 to 15 would merge
MYPU 0440 and MYPU 0430 into a single gene that
would most likely encode a functional MTase of 406 amino
acids. Accordingly, it is predicted that M. pulmonis possesses
several MTase enzymes, some of which would be phase-
variable due to slipped-strand mispairing. 
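
Because a dinucleotide repeat adds or removes two nucleotides, each change in repeat number shifts the downstream reading frame. The following minimal sketch shows the effect on an invented toy gene; the sequence, the function names, and the repeat numbers are hypothetical and do not correspond to any MYPU gene. Only TAA and TAG are treated as stop codons, on the assumption that UGA is read as tryptophan in Mycoplasma.

```python
# UGA is assumed to encode tryptophan here, so only TAA and TAG stop translation.
STOP_CODONS = {"TAA", "TAG"}


def codons_before_stop(seq: str) -> int:
    """Count in-frame codons from position 0 up to the first stop codon."""
    count = 0
    for i in range(0, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            break
        count += 1
    return count


def toy_gene(n_repeats: int, tail: str = "CTAACCGGTCATAG") -> str:
    """Invented gene: start codon, n tandem AG repeats, fixed downstream tail."""
    return "ATG" + "AG" * n_repeats + tail


# With 4 repeats the frame reaches TAA after 4 codons (truncated product);
# with 5 repeats the frame shifts and the ORF runs on to a later TAG.
for n in (4, 5):
    print(n, codons_before_stop(toy_gene(n)))
```

The same kind of frame change is what would merge two adjacent predicted genes into a single longer ORF when the repeat number changes by one.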

The predicted proteins that would result from merging
MYPU 1850 and 1860 and from merging MYPU 0430
and 0440 by slipped-strand mispairing share significant
amino acid sequence similarity with one another and
with numerous cytosine-specific MTase enzymes from
other organisms including some mycoplasmas. Examples
are putative enzymes from M. mycoides subsp. capri (GenBank
Accession No. AF072715) and U. urealyticum (15).
Another example is the SssI MTase from Spiroplasma sp.
strain MQ1 that methylates cytosine in the sequence CG (34).

A Possible Mycoplasma hominis Phase-Variable MTase

The cytosine-specific MTase that is predicted to be 
encoded by the Mycoplasma phage MAV1 genome is similar 
to the predicted gene product from a partial ORF found 
upstream of the M. hominis dnaK gene (GenBank Accession 
No. AJ132792) (K. Dybvig, unpublished data). In addition 
to predicted amino acid sequence similarity, the MAV1 and 
M. hominis genes share extensive nucleotide similarity. 
M. hominis and M. arthritidis, the phage MAV1 host, are 
phylogenetically related and it is possible that MAV1-related
phages are responsible for the spread of restriction and
modification genes among Mycoplasma species through
transduction. The M. hominis MTase ORF lacks a convincing
ribosome binding site and a translation start site.
However, near the 5'-end of the coding region are 10 
tandem copies of the dinucleotide AG. If the number of 
AG repeats changes to 11 by slipped-strand mispairing, 
a ribosome binding site and translation start site would 
emerge. Thus, production of the M. hominis MTase may 
be phase-variable. 

Functional Significance of Phase-Variable Restriction and Modification Enzymes

The purpose of phase-variable restriction systems is 
unknown. In M. pulmonis laboratory-adapted strains, 
nearly all cells in a culture have hsd loci oriented such 
that hsdR and hsdM are not transcribed. Thus, the majority 
of cells in culture are readily susceptible to phage P1.
However, the primary function of the hsd loci may not be to 
protect cells from phage infection. This is especially true 
for the various other phase-variable MTase enzymes that 
M. pulmonis is predicted to produce. No cognate restriction
endonucleases for these MTase enzymes were identified
from the complete genome sequence. Finally, it is not
known how genomic DNA escapes endonucleolytic
attack when phase-variable restriction activity is induced
by hsd inversion. Possible answers to some of these questions
have been proposed but are not altogether satisfying
(9, 19, 21, 43).

Conclusions 

The unusual phenotypes of mycoplasma phages may 
reflect the rapid rate of evolution that has characterized the 
degenerate evolution of mycoplasmas from Gram-positive 
bacteria (25). This may have allowed mycoplasma cells and 
phages to explore evolutionary alternatives not accessible 
to other biological systems. In some cases mycoplasma 
phages seem to have evolved new infection strategies, like 
the distinctive type of temperate infection of Acholeplasma 
phage L2 (productive infection and lysogeny in the same 
infected cell). In other cases mycoplasma phages seem to 
be phages that have adapted from growth in eubacteria to 
growth in mycoplasmas, like the short-tailed phages 
that infect Acholeplasma, Spiroplasma, and Mycoplasma 
species. For these phages, there remain unanswered 
questions on the role of phage tails in infections involving 
wall-less mycoplasma host cells and the mechanism of 
nonlytic release of progeny tailed phages from infected cells. 

An unexpected similarity has been reported in the 
capsid protein structure of the small icosahedral ssDNA 
phages that infect Spiroplasma, Bdellovibrio, and Chlamydia
(4). Spiroplasma are evolutionarily degenerate Gram-positive
bacteria without cell walls. However, Bdellovibrio
and Chlamydia are Gram-negative bacteria, each with 
distinctive morphological, growth, and biochemical charac¬ 
teristics, and phylogenetically distant from each other and 
from Spiroplasma. If the similarity in the capsid proteins 
of the Microviridae phages that infect these three genera 
is an example of convergent evolution, this would imply 
similar selective pressures presented by the different types 
of cell surfaces of these phylogenetically very distant host 
genera. 

There are now a couple of examples of mycoplasma 
phages affecting host pathogenicity. The increased virulence 
observed when M. arthritidis is a phage MAV1 lysogen 
provides a new approach for investigating poorly under¬ 
stood mycoplasma diseases. Also, preliminary data on 
the effect of infections with heterogeneous SRO phages in 
eliminating SRO spiroplasmas causing abnormal sex ratios 
in Drosophila may be an example of phage therapy (phage 
therapy is reviewed in chapter 48). 

The phylogeny of the three mycoplasma genera that 
are known phage hosts shows that Acholeplasma arose from 
the Streptococcus phylogenetic branch, Spiroplasma arose 
from the Acholeplasma phylogenetic branch about the time 
of the first land plants and insects, and Mycoplasma arose 
from the Spiroplasma phylogenetic branch about the time 
of the first mammals. Questions about the origin and evolution
of mycoplasma phages can be phrased in terms of
this host phylogeny. Did mycoplasma phages arise from
phages of Gram-positive eubacteria and coevolve with Acholeplasma
species as they diverged from the Streptococcus
phylogenetic branch, then diverge later and coevolve
with Spiroplasma species, and finally diverge again and
coevolve with Mycoplasma species? In this model, did mycoplasma
phages play a role in host adaptation to growth in
new ecological niches during evolution of the biosphere? 
The requirement that Spiroplasma and Mycoplasma phages 
use UGA as a tryptophan codon argues against a recent 
origin of these phages and suggests they must have arisen 
along with Spiroplasma and coevolved the necessary codon 
changes with these host cells. 
Lactobacillus Phages
The studies conducted with Lactobacillus bacteriophages
reflect the economic and medical importance
of their hosts. Due to the variety of lactobacilli-based food
fermentation processes that can become disrupted by phage
development, and the diversity of mucosal surfaces colonized
by lactobacilli, phage research has not concentrated
on a single Lactobacillus species or phage. Lactobacillus 
phage research also is frequently directed by practical 
needs. For example, the industrially relevant properties of 
lactic acid bacteria are often plasmid-encoded. These plas¬ 
mids are intrinsically unstable, which frequently leads to 
strain degeneration. The DNA-integration mechanism used 
by temperate Lactobacillus phages therefore has been 
studied to develop tools for chromosomal stabilization of 
economically important traits (57, 73). In addition, many 
fermented products undergo a ripening period that may 
last several months. Early lysis of the starter bacteria might 
accelerate ripening through release of the intracellular 
enzymes into the food matrix. To this aim, the expression 
of cloned phage lysis cassettes, controlled by inducible 
promoters, has been studied (19). Furthermore, the need
for phage-insensitive Lactobacillus-based fermentation
starters has promoted the generation of strains harbouring
a cI-like repressor gene or a phage replication origin
(3, 59).


Phage Types 

All types of tailed phages (Caudovirales) have been isolated
from lactobacilli. The phages with long, noncontractile
tails (Siphoviridae) come in two classes: phages with
isometric heads represent the majority of the isolates
(figure 41-1A) and are relatively well documented by complete
genome sequences (1, 46, 58, 66). In contrast, phages
with prolate heads and peculiar knob-like appendages
along the tail (e.g., the group c Lactobacillus delbrueckii
phage JCL1032; 26) have not yet been characterized in any
molecular detail. Molecular data on Lactobacillus phages 
with contractile tails (Myoviridae) and more complicated 
tail appendages (figure 41-1B) are starting to accumulate. 
Examples of Podoviridae Lactobacillus phages have not yet 
been investigated. See chapter 2 for a general overview of 
phage classification. 

Myoviridae 

Myoviridae have been isolated from Lactobacillus casei (37) 
and Lactobacillus plantarum. They come in two genome size 
classes: the smallest are 40-55 kb in size and the biggest 
about 130 kb. Among the latter, L. plantarum phage φLP65
was isolated from a meat fermentation and sequenced
(16b). The genome showed about 160 open reading frames
(orfs). N-terminal sequencing and mass spectrometric
analysis of the major structural proteins from the phage
φLP65 virion allowed the identification of the structural
module on its genome map. 

The gene map of the structural module was identical
to that of the myovirus A511 that infects Listeria, a phylogenetic
relative of Lactobacillus (see chapter 37). The
two phages shared sequence identity at the protein level 
(25-56% amino acid identities) but not at the DNA 
sequence level. The overall genetic organization of this 
module (terminase, portal, major head, major tail, tail tape 
measure, side tail fiber, lysis genes) still resembled that 
of Siphoviridae from the same group of bacteria (see below). 
DNA sequence similarity was, however, only detected for 
the tape measure, lysin, and a few putative transcriptional 
regulation genes from Lactobacillus siphoviruses. The regu¬ 
lation genes were located at both genome ends of phage 
<DLP65. Downstream of the 4>Lp65 lysin lies a cluster of 
13 tRNA genes followed by a DNA replication module, 
which was identified by bioinformatic analysis. The matches 
included a DNA polymerase related to that of Bacillus sub- 
tilis phage SP01 and endo- and exonuclease, helicase and 
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Figure 41-1 Electron micrographs of L case/ siphovirus A2 
and L. plantarum myovirus <t>LP65. Head edge lengths: 

60 ± 3 nm. Tail dimensions of A2:280 nm long, 12 nm wide. 
Tail of <PLP65: non-contracted, 193±8nm; contracted, 

115 ± 5 nm. 

primase genes whose best matches were with T4-like 
Myoviridae from Escherichia coli (phage T4 is reviewed in 
chapter 18). However, more than 50% of the ORFs lacked 
database matches. 

Siphoviridae 

The isometric-head phages from lactobacilli show a familiar
genome organization and belong to one of two major
types of temperate phages isolated from a wide range of
low G-C content Gram-positive bacteria: Sfi21-like cos-site
Siphoviridae and φSfi11-like pac-site Siphoviridae (12). Both
basic phage types showed distant relatedness with lambdoid
coliphages, suggesting a λ supergroup of phages. Sfi21-like
phages shared the organization of the structural genes
with E. coli phage HK97 (20, 33) while Sfi11-like phages
showed weak sequence similarity to head genes from E. coli
phage λ (22, 54) (see chapter 27 for a review of lambdoid
phages and phage λ).

Cos-site Siphoviridae: φadh and A2

Two cos-site Sfi21-like Lactobacillus phages have been
sequenced: Lactobacillus gasseri phage φadh (1) and L. casei
phage A2 (30, 66). Numerous protein sequence similarities
linked these two Lactobacillus phages and the Streptococcus
thermophilus phage Sfi21. The similarity with streptococcal
phages was especially marked for the DNA packaging
and head and tail genes of phage φadh (21) and the DNA
replication genes of phage A2 (59) (figure 41-2A). More
distant relationships were detected with other Sfi21-like
phages. In order of decreasing relatedness these were Lactococcus
lactis phage BK5-T, Staphylococcus aureus phage
PVL, B. subtilis phage φ105, and Clostridium perfringens
phage φ3626. This series reflects the order of phylogenetic
relatedness of the bacterial hosts, but models of coevolution
of phages with their hosts have been questioned by
recent data from lactococci (66). The overall genome size of
the cos-site Lactobacillus phages (43 kb and about 60 orfs)
and their modular organization is comparable to that of 
the pac-site Lactobacillus phages. However, a few differences 
are notable such as the consistent finding of the genetic 
linkage of the DNA packaging type and the constellation 
of the head morphogenesis genes. 

The head gene cluster (small subunit terminase, large 
subunit terminase, portal protein, protease, major head 
protein) is characteristic for Sfi21-like Siphoviridae (20). 
These phages frequently showed proteolytic processing of 
the major head protein during capsid assembly. A potential 
maturation protease of the ClpP family was identified in 
the gene preceding the major head gene (29; B. Henrich,
personal communication). During head maturation an 
N-terminal peptide of about 120 amino acids (predicting 
a conspicuous coiled-coil structure) is cleaved off. The 
released protein is supposed to serve as a scaffold protein 
for head morphogenesis as demonstrated in the much 
better investigated E. coli phage HK97 (33). In fact, Lactobacillus
phage φadh shared not only a related gene map, but
even sequence similarity with several genes from the head 
gene module of Pseudomonas phage D3 (32), suggesting an 
even wider distribution of the Sfi21-like siphoviruses (22). 

In other respects phages φadh and A2 differ from the
standard genome map of Sfi21-like phages. They contain
many very small orfs around (φadh) or downstream (A2)
of the DNA replication module, mostly without database 
matches. The presence of endonuclease-like genes supports 
the suspicion that selfish DNA elements might be found 
in these undefined gene clusters as was previously observed 
in streptococcal (25) and lactococcal phages. In fact, in 
phage A2 most of this region could be deleted without 
any phenotypic effects, at least under laboratory conditions
(49).

Pac-site Siphoviridae: φg1e, LL-H

The Lactobacillus phages φg1e and LL-H rely on pac sites
for headful DNA packaging that results in variable genome
segments of somewhat more than unit length. Their head
gene cluster differs from that of the cos-site phages by encoding
an additional head protein and a separate scaffold
protein. 
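
Why headful packaging yields "variable genome segments of somewhat more than unit length" can be seen with simple arithmetic. The sketch below assumes the commonly described model in which successive headfuls are cut sequentially from a concatemer starting at the pac site; that model, the function name, and the numbers (a 1000-unit genome and a 1030-unit headful) are illustrative assumptions, not measurements for φg1e or LL-H.

```python
def headful_starts(genome_len: int, headful_len: int, n_heads: int) -> list[int]:
    """Start position, on the unit genome, of each successive headful.

    Assumes sequential packaging along a concatemer, with the first cut
    at position 0 (the pac site) and each head taking `headful_len` units.
    """
    if headful_len <= genome_len:
        raise ValueError("a headful must exceed one genome length")
    return [(k * headful_len) % genome_len for k in range(n_heads)]


# Invented numbers: each head carries 103% of a genome.
print(headful_starts(1000, 1030, 4))
```

Each packaged molecule in this toy model carries a 30-unit terminal repeat, and successive molecules begin 30 units further along the genome, which is why the ends differ from one particle to the next.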

The overall genomic organization of the temperate
L. plantarum phage φg1e (46) (figure 41-2B) was identical
to that of pac-site Streptococcus thermophilus phage O1205,
L. lactis phage TP901-1, B. subtilis phage SPP1 (9), Listeria
monocytogenes phage A118 (51) and Staphylococcus aureus
prophages ETA, all members of the proposed genus of
φSfi11-like pac-site Siphoviridae. Frequently, even the gene
order within an individual module was remarkably well
conserved across species and genus barriers (lysogeny
module, DNA replication module, structural genes). L.
delbrueckii phage LL-H and L. plantarum phage φg1e shared
related DNA packaging and head and tail morphogenesis 
modules. However, the sequence similarity was restricted 
to the protein level (suggesting that they are only distant 
relatives). Notably, the sequence similarity did not include 
the major head gene (39), demonstrating that individual 
genes and not entire modules are the units of genetic 
exchange between these Lactobacillus phages (22). 

In contrast, high levels of DNA sequence identity were 
demonstrated between the virulent and the temperate pac-site
phages LL-H and mv4 that infect the same host species
L. delbrueckii (76). Notably, the virulent phage LL-H still 
contained remnants of a phage integrase and a phage 
attachment site, demonstrating that it was recently derived 
from a temperate phage (58). Close genetic relationships 
between temperate and virulent phages were also described 
in other dairy bacteria (56). 

Analysis of Lactobacillus Phage Functions 

Receptors 

The pioneering work of Watanabe et al. (81-84) on the 
recognition and entry of phage PL-1 into its L. casei host 
indicated that the process can be divided into two steps: 
adsorption and injection. The first one is reversible, being 
the specific recognition between the phage and its host 
mediated by a cell wall polymer that contains rhamnose 
as a dominant residue. DNA injection is probably dependent 
on a membrane protein analogous to the phage infection
protein (PIP) used by the L. lactis prolate-head phages
of the c2 quasi-species (31). On the phage side, a protein,
probably located at the tip of the LL-H virion fiber (gp71),
has been found to recognize the receptors of its host,
L. delbrueckii (68). A series of point mutations in the
C-terminal part of gp71 widens the host range of the phage.
Notably, gp71 showed a significant degree of similarity
to a protein from phage JCL1032, which infects the same
host but is otherwise not related to LL-H.

Lysogeny Module 

The determinants that encode the temperate phenotype, 
namely the genetic switch and the integration cassette, are 
clustered in all Siphoviridae able to lysogenize members 
of the low G-C branch of Gram-positive bacteria (55). The 
module is typically located between the lysis and replication
cassettes and the genes are divergently transcribed.
To one side of the genetic switch, the first gene encodes the
lysogenic cycle repressor (cI, following the λ nomenclature),
which is followed by two orfs of unknown function. The
final gene to this side encodes the site-specific phage
recombinase that mediates insertion of the phage DNA into
the genome of its host. The recombinase is followed by the
phage attachment (attP) region, where the recombination
with the homologous attB site of the host chromosome
takes place. In the vegetative phage map these genes oppose
the convergently oriented lysis genes, being separated by
a transcription termination loop.

Genetic Switch 

To the other side of cI lies a cro-like repressor gene that
may be followed by a putative antirepressor gene (reviewed
for phage λ in chapter 8). Between cI and cro there is an
intergenic region that harbors the two promoters, PL and PR,
and the dyad-symmetry operator sequences for CI and
Cro repressor binding (figure 41-3). The functionality of the
CI-like repressors has been proven for three phages (φadh,
φg1e, and A2) on the basis of their capacity to confer
superinfection immunity to a previously susceptible host
and by their ability to suppress transcription from the PL
and PR promoters (24, 28, 46, 49). The CI repressors from
phages φg1e and A2 have been purified and shown to bind
their operator sequences in a cooperative manner. In both
cases and as expected, the affinity of CI was higher for the
operator sequences located in the vicinity of the PR promoter.
In addition, the CI repressor of phage A2 redirected the
RNA polymerase toward PL, in spite of the polymerase's much
stronger affinity for PR in the absence of any regulatory protein,
thus promoting transcription of cI and the entry of the
phage into the lysogenic cycle. For both phages, as the levels
of CI were raised, all the operator sequences became occupied,
resulting in autoregulation of CI biosynthesis (28, 39).
Conversely, the purified Cro-like repressors from phages
φg1e and A2 initially bound the operator sequences lying
between the -35 and -10 hexamers of PL and, as their
concentrations increased, they also attached to the operators
overlapping PR (38, 40, 47). In the case of phage A2
this resulted in DNA looping and displacement of the RNA
polymerase from it (48).

Both CI and Cro repressors are dimers in solution 
and present DNA-binding helix-turn-helix motifs in their 
N-terminal moiety. Their C-terminal ends, by contrast, are 
responsible for dimerization. In the case of phage A2, the 
C-terminal end in addition enhances the affinity of Cro for 
the DNA (38, 47). 

Regulation of Phage Life Cycles 

The general features of the lysis/lysogeny decision of 
Lactobacillus phages resemble those of the λ genetic switch 
(35, 67), but their structural characteristics and regulation 
seem to be simpler. First, the synthesis of CI is constitutive 
and probably coexists with the expression of the lytic 
operon from the start of infection (40, 49). The situation 
may be facilitated (i) by the long intergenic region between 
cI and cro, where two RNA polymerases may coexist (as has 
been shown in vitro for phage A2; 28) and (ii) by the lack 
of overlap between PL and PR (in the case of phage φadh, 
the situation may be different because the two promoters 
overlap; 24). In order to direct a significant proportion of 
the infectious events toward the lytic route, the constitutive 
synthesis of CI presumably has to be counterbalanced 
by an efficient production of Cro. This might explain why, in 
the case of phage A2, the transcripts arising from PR are at 
least 10 times more abundant than those generated from 
PL (28). In addition, the PR transcript, which covers cro, 
elongates further into the early region to comprise the 
replication genes in the absence of any phage-encoded products 
(such as the λ N antitermination protein) (59 and unpublished 
data). 

[Figure caption, beginning missing] ...open boxes. The transcription start sites of the divergently oriented promoters are signaled by bent arrows. The numbers 
within the double-headed arrows indicate the lengths (in base pairs) of the relevant features as follows (from top to bottom): 
distance between operator sequences, operators, segment from center to center of contiguous operators, stretch between 
the centers of the PL and PR promoters; note especially the difference in size of the segments located between the operator 
sites and the total length of the PL-PR intervening sequence. In the case of φg1e another operator sequence was detected 3' 
of each of cpg and cng (which are the functional equivalents of phage λ cI and cro, respectively). 

Integration 

The integration systems of several Lactobacillus phages, 
including φFSW of L. casei (74, 75), φadh of L. gasseri (69), 
mv4 of L. delbrueckii (23), φg1e of L. plantarum (85) and A2 
of L. casei (2), have been functionally characterized and 
various food-grade integration vectors have been constructed 
(57, 73). The five recombinases belong to the tyrosine 
integrase family (4, 62), ranging in size from 385 residues 
(phages φadh and A2) to 427 residues (phage mv4). The 
prophages integrate into the 3' end of tRNA genes, except 
for phage φFSW, in which integration occurs at the end 
of the glucose-6-phosphate isomerase gene. In all these 
cases, the functions of the target genes are conserved. Only 
Lj965, a prophage of Lactobacillus johnsonii, disrupts a 
tRNA(Pro) gene (curiously, this prophage carries four tRNA 
genes but none is specific for proline) (79). 

The length of the common attP/attB core is between 
16 bp (φadh) and 40 bp (φFSW). The recognition requirements 
of attB seem not to be very strict. Both the phage 
mv4 and phage A2 integrases promote integration of 
plasmids carrying their cognate attP DNA sequences into 
the genomes of a variety of Gram-positive bacteria of low 
G-C content and even into E. coli (2, 6, 7). Plasmid insertion 
takes place in positions homologous to their natural 
attB sequences in lactobacilli but it occurs in a variety of 
places, whether protein coding or intergenic, in other 
species. This variety of attB-like sites allowed the definition 
of minimum sequences for the integration to occur. In the 
case of phage mv4, this was a 16 bp DNA segment, where 
some nucleotides were critical for integration while others 
could be changed without loss of recombination efficiency 
(6). In phage A2 a conserved heptanucleotide, embedded 
in a 19 bp degenerate sequence, defined the attB site (2). 
All attP regions showed a high proportion of AT sequence 
and several direct and inverse repeats. These probably act 
as recognition sites for the integrases in order to generate 
the intasome complex that precedes synapsis formation 
with the attB site (71). The minimal size of attP has only 
been determined for phage mv4. It comprises a DNA 
segment slightly longer than 200 bp. The attP-core sequence 
is found at one of its ends (5). 

Replication 

Lactobacillus phage replication functions have not been 
investigated in great detail. No replication enzyme has been 
isolated and the only functional studies deal with the 
replication origins (ori) of phages φadh and A2. The ori 
region, acting in cis, promotes plasmid replication in the 
phages' natural hosts and, in the case of phage φadh, also in 
L. lactis (1). In phage A2, ori may act as a decoy for 
phage-specific proteins because, when present in multicopy, it 
partially inhibited phage replication. This resulted in a 
significantly lower phage progeny count and thus in partial 
resistance against infection. Moreover, a DNA fragment 
containing ori was retarded in gel mobility-shift experiments 
when incubated with extracts of infected cells (59). 

Morphogenesis 

Information on the processes leading to virion formation 
in lactobacilli mainly refers to phages φg1e and A2. In 
phage φg1e a major structural protein was purified and 
specific antibodies were raised that allowed its assignment 
to the tail of the virion (41). In phage A2, the virion 
presents three major proteins, one of which was assigned 
to the tail while the other two were capsid components. 
These two shared their N-termini (29). MALDI-TOF mass 
spectrometry indicated that the smaller (gp5A) is the 
product of gene 5 translation. Its larger counterpart (gp5B) 
arises through a -1 ribosomal frameshift at the 
penultimate codon of orf5 mRNA, resulting in a product 
that is 85 amino acids longer than gp5A (30). This is similar 
to the situation described for several E. coli phages, 
exemplified by T3 and T7, where gene 10 yields two major coat 
proteins (17, 18) (see chapter 20 for a review of phage T7, 
and T3, biology). Both gp5A and gp5B appear to be essential 
for phage viability because lysogens harboring prophages 
that produce only one or the other protein become lysed 
upon induction with mitomycin C but no viable phage 
progeny are observed (30). 

Additional information on the morphogenesis of Lactobacillus 
phages comes from phage A2, whose terminase 
small subunit has been characterized. The protein 
specifically recognizes a short DNA segment that includes 
the cos region of the phage. In an ATP-dependent reaction, 
the terminase small subunit induces bending in the cos 
region prior to staggered DNA cleavage at cosN (27). 


Lysis 

The lysis modules of six different phages (mv1, φadh, φg1e, 
PL-1, LL-H, and SC921) have been investigated (11, 34, 42, 
43, 77, 87) because of their potential use in accelerated food 
ripening (19, 72). However, the first cell wall lytic activity was 
purified from PL-1-infected cultures of L. casei, with the 
aim of using it in protoplast generation (as a prerequisite 
for DNA transformation). The enzyme turned out to be an 
endo-N-acetylmuramidase of 37 kDa, mostly active on the 
cell wall of the phage's host (80). However, Kashige et al. 
(43) reported the sequence of a PL-1-encoded 
acetylmuramoyl-L-alanine amidase, casting doubt on the origin 
of the previously isolated lysozyme-like enzyme. The PL-1 
amidase gene is included in a bicistronic operon and is 
preceded by a putative holin gene, the typical gene 
arrangement of the phage lysis cassettes in lactic acid bacteria. 
Curiously, this is the only amidase identified so far from 
phages of lactobacilli; all other investigated lysins are 
muramidases. 

Structurally these muramidases are rather diverse, 
their sizes ranging from 195 amino acids (mv1) (11) up to 
442 residues (φg1e) (34, 42, 64, 77). Usually they are most 
active on the walls of their host bacteria. For example, LysA 
from phage mv1 and Mur from phage LL-H efficiently 
degrade the peptidoglycan of thermophilic lactobacilli 
(their host, L. delbrueckii, belongs to this group) but are 
inactive on walls of L. casei or L. lactis (11, 77). In the case of the 
phage φg1e lysin, several acidic residues and a serine, all 
located in its N-terminal moiety, were essential for the 
lytic activity while deletion of the 45 residues nearest the 
C-terminus did not abrogate lysis induction. 

A striking feature of the φg1e lysin is the presence 
of a signal peptide of 26 amino acids cleaved from the 
mature enzyme. This suggests that secretion of the lysin 
through the cytoplasmic membrane occurs via a 
Sec-A-dependent mechanism (42). However, the φg1e lysis 
cassette also harbors the gene for a functional holin, as 
determined by the formation of cell ghosts upon expression 
of this protein in E. coli (63). This raises the question 
of whether both lysin export systems might be functional 
in the L. plantarum host cells. The holin of phage 
φg1e is a basic polypeptide of 142 residues that contains 
three hydrophobic domains in its N-terminal moiety 
which promote insertion of the protein in the cell membrane 
as a prerequisite for pore formation, which is also 
dependent on the C-terminal part of the protein (63). See 
chapter 10 for a review of holin-mediated phage-induced 
bacterial lysis. 
Lactobacillus Prophages 

Prophages represent a sizable part of the strain-specific 
DNA in many bacteria (for recent reviews, see 15, 16a). 
Since phages are prominent mobile DNA elements, this 
observation was not unexpected. Two not necessarily exclusive 
hypotheses were proposed with respect to the evolutionary 
role of prophage DNA in bacterial genomes. The 
present prophage content could represent snapshots in 
the arms race between bacteria and phages where selfish 
DNA elements litter the bacterial genomes. In this model, 
prophages are constantly acquired by lysogeny and, since 
they represent a threat to the survival of the cell (prophage 
induction) and a metabolic burden (extra DNA for replication), 
they are constantly removed by selection using 
a nonspecific deletion process (50). In an alternative model, 
prophages play an active and sometimes mutualistic role 
for short-term bacterial evolution. In fact, prophages contribute 
a constant influx of extra genes (lysogenic conversion 
genes) into a given bacterial species and thus allow 
the selection of strains that are best adapted to a given 
environment (13). Prophage genomics in some pathogenic 
bacteria such as Streptococcus pyogenes supports the second 
model. They carry genes, expressed during lysogeny, 
that encode superantigens/toxins, mitogenic molecules/ 
DNases, and other virulence factors involved in disease 
(8, 10, 36). 

Do prophages also play a role in the ecological adaptation 
of nonpathogenic microorganisms? To address this 
question, the genomes from two recently sequenced Lactobacillus 
commensals were investigated for their prophage 
content. The gut commensal L. johnsonii strain NCC533 
contained two apparently complete, but noninducible 
prophages, Lj928 and Lj965, and a prophage remnant in 
addition to numerous isolated phage-like integrase genes 
(79). The two complete prophages were classified as 
φSfi11-like pac-site phages (12). However, sequence similarity 
between them was restricted to a few genes. The similarity 
included, surprisingly, the phage integrase, which apparently 
obliged Lj965 to use a secondary attachment site 
leading to the inactivation of a tRNA gene (79). Both 
prophages contained extra genes of unknown function 
within the lysogeny module. Further extra genes were 
inserted upstream of the structural gene module and 
encoded several tRNAs. Most of the extra genes were 
constitutively transcribed in the lysogen while the rest of the 
prophage genome was transcriptionally almost silent. 
When the DNA from eight L. johnsonii strains showing 
distinct pulsed-field restriction patterns was hybridized 
to a microarray of the sequenced strain NCC533, it was 
observed that the prophage DNA contributed about half of 
the strain NCC533-specific DNA (79). 

The oral L. plantarum isolate WCFS1 (45) contained two 
complete prophages, Lp1 and Lp2, also members of the 
φSfi11-like group of pac-site Siphoviridae (78). They shared 
long stretches of DNA sequence identity, including one of 
about 10 kb over their structural gene modules. 

Phages Lp1 and Lp2 lacked sequence relatedness with 
the L. plantarum phage φg1e. This observation suggests 
the existence of two lineages of φSfi11-like pac-site Siphoviridae 
in L. plantarum. Genes without obvious phage links 
were located by comparative genomics analysis in the 
lysogeny module and between the lysin gene and the 
right attachment site. Notably, two of these genes shared 
sequence similarity with likely lysogenic conversion genes 
from Streptococcus pyogenes prophages (including mitogenic 
factor-like genes). This extra DNA, which again included 
tRNA genes, belonged to the few transcribed segments of 
the prophage genome. WCFS1 also contained two prophage 
remnants. Remnant R-Lp3 consisted of a few genes from 
the lysogeny, DNA replication, head and head-to-tail joining 
genes of a typical Sfi21-like cos-site phage, but it lacked 
sequence similarities with phages φadh and A2, suggesting 
the existence of a second lineage of cos-site Siphoviridae 
in Lactobacillus. R-Lp3 suffered extensive gene losses, fusion 
of gene fragments, and DNA rearrangements. R-Lp3 abutted 
Lp2 and showed no transcriptional activity. Another short 
prophage remnant, R-Lp4, showed incomplete lysogeny 
and DNA replication modules and a gene with a database 
link to an anonymous gene from an Enterococcus faecalis 
pathogenicity island. 

The observations with Lactobacillus prophages concur 
with recent predictions on the role of prophages for the 
genome evolution of the bacterial hosts. Elements of an 
arms race between phages and their hosts are apparent: the 
bacterial counterattack is, for example, suggested by the 
apparent inactivation of prophages, and the presence of 
prophage remnants and many isolated phage-like integrase 
genes. At the same time, prophage analysis suggests 
signs of genetic cooperation. For example, there is 
constitutive transcription of candidate lysogenic conversion 
genes (14). 

Ecology 

Lactobacillus bacteriophages are of substantial research 
interest since their host bacteria are involved in numerous 
industrial food fermentation processes. In the dairy industry, 
L. delbrueckii is used in a coculture with Streptococcus 
thermophilus for the production of yogurt. While phage 
attack is a serious problem for S. thermophilus, L. delbrueckii 
is a relatively rare target of phage attack in industrial 
fermentations, suggesting that some kind of defense against 
phage infection might be operative in these strains. 

Sauerkraut fermentation is spontaneous and relies on 
bacterial epiphytes present on cabbage. An ecological 
survey demonstrated a succession of two phage populations 
corresponding to the replacement of Leuconostoc spp. 
by Lactobacillus spp. within the fermenting vats (53, 86). 
Lactobacillus phages have been isolated from numerous 
other spontaneous fermentation processes involving coffee, 
pickled cucumbers, cereals, wine, and meat (52, 61). However, 
phage infections during salami production had no 
industrial impact. After an initial rise, the phage titers 
dropped and the initial phage-sensitive starter population 
was replaced by a phage-insensitive mutant derivative 
strain (60). 

In addition, lactobacilli are normal inhabitants of 
the alimentary and urogenital tract of man and many 
animals. Lactobacilli were therefore proposed as probiotic, 
i.e., health-promoting bacteria (70). The production of 
probiotic strains can be delicate since they represent 
monocultures that may be especially susceptible to phage attack. 
In practice, some were produced for long periods in the 
absence of phage problems. A possible reason for this phage 
resistance might be their polylysogenic nature. 

However, phage attack on Lactobacillus commensals 
was suggested to have health consequences (44). Lactobacilli 
constitute the dominant bacterial microbiota in the 
vagina and are beneficial to women's health since they 
inhibit the growth of harmful microorganisms. About 
50% of women with bacterial vaginosis carry lysogenic 
lactobacilli in their vagina. The mutagen benzopyrene 
created by tobacco smoking induced phages from lysogenic 
lactobacilli at concentrations that were found in vaginal 
secretions of women who smoked (65). This led to the 
intriguing scenario that smoking might reduce vaginal 
lactobacilli by promoting phage induction, which would 
lead to an overgrowth by anaerobic bacteria and thus to 
bacterial vaginosis. 

For the genomics researcher, Lactobacillus phages are 
an interesting group for two reasons. First, sequencing 
data are available for phages from five distinct species 
of lactobacilli (L. delbrueckii, gasseri, plantarum, casei, and 
johnsonii). Second, in contrast to most phages that show 
narrow host ranges (however, see chapter 46), several 
Lactobacillus phages can infect two or more members of the 
genus. One sauerkraut phage isolate infected ecologically 
related L. brevis and L. plantarum strains (53) and inducible 
prophages from vagina showed a wide host range 
including L. crispatus, jensenii, gasseri, fermentum, and 
vaginalis (44). This property offers substantial opportunities 
for phage-mediated lateral gene transfer between different 
lactobacilli. 

Outlook 

Research in the field of Lactobacillus phages has been 
driven by very different motives leading to a situation 
where some data are available for many phages but no 
comprehensive dataset exists for a type phage. This is a 
clear disadvantage for classical phage biologists reared on 
the diet of reductionist thinking. However, phage biology is 
currently at a turning point from serving as simple model 
systems in molecular biology to providing handy tools for 
the exploration of complex ecological relationships. In this 
respect, Lactobacillus phages have something to offer: they 
comprise many different phage types and infect hosts 
that are found in many and diverse ecological settings 
(several of them with clear economic and medical interest). 
Two assets can still be quoted from this chapter's introduction. 
First, in contrast to lactococci and lactic streptococci, 
there exists an extensive literature on the roles played by 
lactobacilli in their different habitats. Second, the genome 
sequences from more than 10 different Lactobacillus species 
will soon be available. What we need now is a comparable 
sequencing effort for Lactobacillus phages and the stage will 
be set for a genomics-oriented molecular ecology of 
phage-host interactions on the mucosal surfaces of humans 
(alimentary and genital tract) or on plants. Lactobacillus 
phage research could thus become a meeting point for 
scientists currently working in separate fields (dairy science, 
food microbiology, medical microbiology, ecology). As an 
example, dairy microbiologists have extensively explored 
the possibilities for protecting starter bacteria against 
phage attack, while medical microbiologists have started to 
seriously investigate the killing potential of lytic phages 
on bacterial pathogens. What about cross-fertilizing the 
activities between dairy and medical microbiologists by 
developing phage-resistant probiotic lactobacilli? 
Control of Bacteriophage in Commercial 
Microbiology and Fermentation Facilities 

GREGG BOGOSIAN 


There are an estimated 5 × 10^30 bacterial cells on Earth 
(29), suggesting a global population of about 1 × 10^31 
bacteriophages. Roger Hendrix, in his characteristic whimsical 
fashion, has calculated that if bacteriophages were 
the size of cockroaches, they would cover the surface of the 
Earth in a layer 50,000 km deep. This type of thinking 
sends chills through those who work in commercial fermentation 
facilities, where bacteriophages really are considered 
cockroaches. 

Experienced members of the commercial fermentation 
industry live in constant fear of bacteriophage contamination. 
Products obtained from the large-scale 
fermentation of bacteria include a wide variety of foods, 
pharmaceuticals, vitamins, solvents, enzymes, and more. 
The loss of an entire fermentation batch to bacteriophage 
contamination is a dramatic and unsettling event, often 
leading to such heavy bacteriophage contamination of 
the fermentation facility that further fermentation runs 
are not possible for an extended period of time. In addition, 
raw materials, isolation and purification equipment, 
and finished product may be contaminated by bacteriophage. 
The economic setbacks from product loss, raw 
material spoilage, and nonproductive operation costs can 
be substantial. Bacteriophage contamination can become 
an extremely distressing and frustrating problem for 
fermentation-based industries. There have been instances 
when bacteriophage outbreaks have taken months to 
bring under control. This chapter explores the possible 
causes of bacteriophage outbreaks, and the most effective 
approaches to handling the problem. 


Extent of the Problem of Bacteriophage 
in Industrial Fermentations 

Industries based on large-scale bacterial fermentations 
can be sorted into three classes, representing increasing 
levels of culture purity and media sterility: food production, 
commodity chemical production, and the biotechnology 
and pharmaceutical industries. Materials such as 
silage, compost, and treated sewage can also be considered 
products of large-scale bacterial fermentation, and are 
subject to upset by bacteriophage infestation, but the 
scope of this chapter is the threat of bacteriophage to the 
growth of commercial bacterial strains in controlled 
fermentation vessels. 

In principle, any bacterial population is susceptible to 
attack by bacteriophage, and all three classes of the commercial 
fermentation industry suffer from the problem. By 
far the greatest problem is in the food industry, especially 
in the production of cheeses, yogurt and other fermented 
dairy products by lactic acid bacteria (mostly streptococci). 
The proportion of dairy-product fermentation batches lost 
to bacteriophage has been estimated to be in the range of 
1-10%, probably due to the nonsterile fermentation conditions 
employed, along with raw starting materials that 
can be contaminated with bacteriophage at levels too low 
to be detected. The industries employing bacterial fermentations 
to produce commodity chemicals, including 
amino acids, vitamins, organic acids and alcohols, enzymes, 
solvents, biopolymers, and antibiotics, have also experienced 
considerable problems with bacteriophage. The struggles 
with bacteriophage contamination and control in 
these older fermentation industries have been extensively 
reviewed (1, 4, 5, 10, 11, 16, 17, 19, 21-23). 

The advent of the recombinant DNA era in the late 
1970s gave rise to the new biotechnology industry. By the 
early 1980s this industry was manufacturing commercial 
products from fermentation of recombinant bacteria, 
especially Escherichia coli strains. There has been only 
general mention in two of the more recent reviews (19, 23) 
of problems with bacteriophage in the biotechnology industry 
for three reasons. First, the industry is only about 
20 years old. Second, the high containment employed for 
these fermenters provides a higher degree of protection 
from bacteriophage contamination. Third, no papers on 
bacteriophage problems in the biotechnology industry 
have been published to date. 

For this review the author drew on his own experiences 
and those of his colleagues from 20 years in the 
trenches with bacteriophage, and also (with the generous 
help of the Society for Industrial Microbiology) queried 
many individuals in the biotechnology fermentation industry 
about encounters with bacteriophage. Representatives 
of 34 companies replied to the query: 24 reported that they 
had experienced fermentation losses from bacteriophage and 
10 reported that they had not. The proportion of biotechnology 
fermentation batches lost to bacteriophage was about 0.1%, 
which is 1-2 orders of magnitude less than in the 
dairy-products fermentation industry. Many of the respondents 
provided unpublished observations that have been incorporated 
into this chapter. 
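
The survey figures can be checked with a few lines of arithmetic. The sketch below is only an illustration: the function names are invented here, and the dairy range of 1-10% is taken from the estimate quoted earlier in this section.

```python
import math


def fraction_reporting_losses(with_losses, without_losses):
    """Share of responding companies that reported phage-related losses."""
    return with_losses / (with_losses + without_losses)


def orders_of_magnitude(higher_rate, lower_rate):
    """Number of powers of ten separating two batch-loss rates."""
    return math.log10(higher_rate / lower_rate)


if __name__ == "__main__":
    # 24 companies reported losses, 10 did not (34 replies in total).
    share = fraction_reporting_losses(24, 10)
    print(f"Companies reporting losses: {share:.1%}")

    biotech_loss = 0.1  # percent of batches lost, biotechnology survey
    for dairy_loss in (1.0, 10.0):  # percent range quoted for dairy products
        gap = orders_of_magnitude(dairy_loss, biotech_loss)
        print(f"Dairy {dairy_loss}% vs biotechnology {biotech_loss}%: "
              f"{gap:.0f} order(s) of magnitude")
```

About seven in ten respondents had lost batches, yet the per-batch loss rate sits one to two powers of ten below the dairy range.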

Symptoms of a Bacteriophage- 
Contaminated Fermentation Process 

In a controlled fermentation vessel, there are many possible 
symptoms of attack by bacteriophage. These symptoms, 
of which several may be present, are due to the inhibition of 
bacterial growth and lysis of the culture. They include a 
drop in turbidity, an increase in viscosity, excessive foaming, 
a rise in dissolved oxygen, a drop in carbon dioxide generation, 
and decreased demand for temperature control, pH 
adjustment, and nutrients such as ammonia or the carbon 
source. 

Trained and experienced fermentation personnel can 
recognize bacteriophage contamination by both direct 
visual inspection and ordinary light microscopic examination 
of a sample of the culture. The sample may be more 
viscous than normal, and may appear grainy. In some 
cases, the turbidity of the culture may be so reduced that 
the sample takes on the appearance of uninoculated culture 
medium with particulate matter in it. Microscopic examination 
may reveal that the sample contains fewer than the 
normal number of bacterial cells, and cell debris may 
be present. Prolonged movement (streaming) of the cells 
and debris may be observed, behavior associated with 
the high viscosity of the sample. 

Further confirmation of the presence of bacteriophage 
can be obtained in an easy but not necessarily quick test. 
The suspect culture is centrifuged at low speed to pellet 
the cells and debris, and the supernatant is passed through 
a 0.2 µm filter to ensure complete removal of cells. A small 
amount of filtrate (e.g., one drop, or 50 µl) is then spotted 
on a lawn of appropriate indicator bacterial cells, usually 
prepared in a top agar. To ensure appropriate susceptibility 
to any bacteriophage present, the best choice as bacterial 
indicator would be the same strain that is being fermented. 
One or more plaques in the area of the spotted filtrate indicate 
the presence of bacteriophage, but this is usually not evident 
until after overnight incubation. If the filtrate contains 
many bacteriophage, then the spotted area will appear as 
a clear area in the lawn due to the presence of overlapping 
plaques. It is possible with lawns that have been 
prepared hours ahead of time to get results in as little as 
1 hour. The test can also be run as a mixed pour plate by 
mixing a small volume (e.g., 0.1 ml) of the filtrate with 
the bacterial cells and top agar before pouring the lawn. In 
this case, a positive test would be the appearance of one or 
more plaques in the lawn, or many overlapping plaques 
and thus confluent lysis and total clearing of the plate. 

The appearance of individual plaques can be taken as 
a definitive finding of bacteriophage. However, it should 
be kept in mind that other fermentation upsets, such as 
the inadvertent loss of pH control or the introduction of 
an inhibitory chemical such as a detergent or an antibiotic, 
can give similar fermentation symptoms and local 
clearing of the bacterial lawn in a spot test or total clearing 
in a mixed pour plate test. If there is any reason to suspect 
such a problem, a more rigorous bacteriophage test would 
be to prepare several 10-fold dilutions of the culture filtrate, 
and mix these with the indicator bacterial cells and 
the top agar before pouring the lawn. In such a test, true 
bacteriophage contamination would be evident by the 
presence of individual plaques at the higher dilutions. 
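
The dilution test described above also yields a titer estimate, using the standard plaque-assay relation: plaques counted, multiplied by the fold dilution, divided by the volume plated. The following is a minimal sketch under stated assumptions: the function names, the plate observations, and the wording of the interpretations are hypothetical and are not taken from any facility procedure.

```python
def titer_pfu_per_ml(plaque_count, fold_dilution, volume_plated_ml):
    """Estimate the titer (PFU/ml) of the undiluted culture filtrate.

    plaque_count     -- individual plaques counted on one plate
    fold_dilution    -- how many fold the filtrate was diluted (e.g. 1e4)
    volume_plated_ml -- volume of diluted filtrate mixed into the lawn
    """
    if volume_plated_ml <= 0 or fold_dilution < 1:
        raise ValueError("volume must be positive and dilution at least 1-fold")
    return plaque_count * fold_dilution / volume_plated_ml


def interpret_series(plates):
    """Classify a 10-fold dilution series.

    `plates` maps fold dilution -> observation, where an observation is an
    integer plaque count, or the string "clear" for total clearing.
    """
    has_plaques = any(isinstance(obs, int) and obs > 0 for obs in plates.values())
    if has_plaques:
        return "individual plaques: consistent with bacteriophage"
    if any(obs == "clear" for obs in plates.values()):
        # Assumption: clearing that never resolves into plaques on dilution
        # points to an inhibitory chemical rather than to phage.
        return "clearing without plaques: consider a chemical inhibitor"
    return "no evidence of bacteriophage"


if __name__ == "__main__":
    # Hypothetical series: confluent lysis at low dilution, countable
    # plaques once the filtrate has been diluted far enough.
    series = {1: "clear", 10: "clear", 100: "clear", 1000: 180, 10000: 21}
    print(interpret_series(series))
    print(f"{titer_pfu_per_ml(21, 1e4, 0.1):.2e} PFU/ml")
```

Counting from the highest dilution that still gives well-separated plaques avoids undercounting where plaques overlap.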

Consequences of Bacteriophage 
Contamination 

Low levels of bacteriophage contamination do not precipitate 
the fermentation irregularities listed above, and 
may escape notice in the absence of a spot test. Indeed, in 
the food and some of the commodity chemical fermentation 
industries, low-level bacteriophage contamination 
is tolerable in that acceptable product is still obtained. 
For almost all of the biotechnology industry, however, no 
level of bacteriophage contamination is acceptable due 
to the potential for process and product adulteration, and 
every fermentation batch is subjected to a bacteriophage 
test as a standard quality control measure. 

Fermentation batches lost to bacteriophage mean the 
loss of both product and raw materials used to prepare 
the culture medium. In addition, there are the costs associated 
with the cleanup and the losses resulting from any 
downtime in the facility. Subsequent quality control investigations 
and implementation of corrective action plans add 
to the costs associated with a bacteriophage outbreak. 
With a valuable fermentation product, such as a pharmaceutical, 
the cost of a brief encounter with bacteriophage 
could be in the millions of dollars. 

It is reassuring to inform facility personnel that bacteriophage 
pose absolutely no risk to humans. Indeed, there is 
renewed interest in bacteriophage therapy and other applications 
of bacteriophage, where studies have shown that 
exposure to bacteriophage does not pose any risk to 
humans (3, 12, 26, 27; and chapter 48). 

Control of Bacteriophage Contamination 

Keeping Bacteriophage Out of a 
Fermentation Facility 

Most fermenters are designed to be closed systems and 
the contents would not normally be exposed to environmental 
contaminants such as bacteriophage. However, no 
system is perfectly closed, and several potential sources of 
bacteriophage entry have been identified. 

The culture medium in the fermenter is normally heat- 
sterilized in place, and would be bacteriophage-free. The 
addition to the fermenter of nonsterile solutions, or
solutions which have been filter-sterilized, could introduce
bacteriophage. The inocula should not be overlooked in 
this respect, and the cell banks used to prepare the 
inocula should be tested for bacteriophage before use. 
Some unsterilized raw materials are essentially self- 
sterilizing; for example, ammonium hydroxide at the
standard concentration of 29%, commonly used for pH control
and as a nitrogen supplement, inactivates bacteriophage 
with a decimal reduction time (the time required for a 
10-fold drop in bacteriophage titer) of about 2 seconds. 
Solutions with a pH of less than 5 or greater than 10 would 
have similar self-sterilizing behavior with respect to most 
bacteriophage (18). 
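
The decimal reduction time can be converted into an expected surviving titer. The following is a minimal sketch that assumes log-linear (first-order) inactivation, which the text implies but does not state; the function names and the starting titer of 1e6 PFU/ml are hypothetical and chosen only for illustration.

```python
def surviving_titer(initial_titer, d_value_s, exposure_s):
    """Titer remaining after exposure, assuming log-linear inactivation.

    Each decimal reduction time (D-value) lowers the titer 10-fold.
    """
    return initial_titer * 10 ** (-exposure_s / d_value_s)


def time_for_log_reduction(d_value_s, logs):
    """Exposure time needed for the requested number of 10-fold drops."""
    return d_value_s * logs


# 29% ammonium hydroxide: decimal reduction time of about 2 seconds (from the text).
# The starting titer below is a hypothetical value, not a measurement.
remaining = surviving_titer(1e6, d_value_s=2.0, exposure_s=12.0)
seconds_needed = time_for_log_reduction(2.0, 6)
print(f"{remaining:.3g} PFU/ml remain after 12 s")
print(f"{seconds_needed:.0f} s needed for a 6-log reduction")
```

Under this assumption, six decimal reduction times (12 seconds) would take a titer of 1e6 PFU/ml down to about 1 PFU/ml.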

For many biotechnology fermentation culture media, 
antibiotics are included to maintain selective pressure for 
the plasmid expression system. These antibiotics are also 
very effective at keeping the culture free of bacterial 
contaminants, so much so that complacency may set in 
among the operators. It is imperative that facility personnel 
be reminded not to relax their vigilance against
bacteriophage, which of course would be completely impervious to
antibiotics. 

The fermenter components should be maintained in 
good working order and inspected regularly. One common 
source of bacteriophage is chilled water, running through 
coils in the fermenter to control culture temperature;
leaking cooling coils could introduce bacteriophage. In some
facilities, the frequency of bacteriophage attack correlated 
with bacterial load in the chilled water. The bacteria were 
not necessarily directly related to the bacteriophage but, 
rather, served as an indicator of the effectiveness of the 
water-treatment system supplying the chilled water. The 
occurrence of an attack was also correlated with the 
chilled water having a higher than normal biological 
oxygen demand (BOD), indicating a high organic matter 
content. Additional water treatment, maintenance, and
purging of the system eliminated the problem. Sodium
hypochlorite is sometimes added at very low concentrations
to chilled water systems to control microbial contamination,
but the corrosiveness of this chemical may cause
leaks in the cooling coils.

Potential bacteriophage contamination of the air supply
is a critical weak point, being difficult to treat. Air
intake filters, even though their average pore size is
slightly larger than most bacteriophage, can remove low
levels of bacteriophage from the incoming air (21). The location
of the air intake system is critical in minimizing the
numbers of bacteriophage entering the system (see below).
Humid air (or wet weather) makes air filters less effective.
It has also been observed that if air filters are depressurized
too fast after sterilization, then they can rupture and
be rendered much less effective. Frequent integrity testing
of air filters is also advisable. Heat treatment of incoming
air is highly effective against bacteriophage, but not practical
when large air volumes are required. Good microbiological
and cleaning practices (see below for cleaning
recommendations) help to keep bacteriophage out of the 
fermentation facility. While it may seem obvious to employ 
cleaning agents that include a disinfectant effective against 
bacteriophage, agents which are effective against bacteria 
but not against bacteriophage (see below) still have utility. It 
is important to prevent host bacteria from contaminating 
the facility and providing a reservoir where bacteriophage 
could propagate. 

Monitoring within the facility for the presence of 
bacteriophage should be performed at least once every 
2 weeks. In this procedure, small pieces of sterile absorbent
filter paper are used to wipe the test area and are then soaked in
water to extract any bacteriophage that may have been
picked up; the extract is passed through a 0.2 µm filter to
remove larger particles, and the filtrate is used in a spot test
or mixed pour plate test. These tests are described above.
The same criteria should be used to interpret the test 
results, including consideration of the possibility of false 
positive results from the wipe test picking up a chemical 
inhibitory to the indicator bacteria. Any positive results 
should provoke a bacteriophage action plan, consisting at 
least of increased cleaning and monitoring. Key areas to 
monitor and clean include floors, sinks, drains, and any 
instruments or areas where the bacterial culture is handled 
or samples stored, such as spectrophotometers, centrifuges, 
and refrigerators. 

The design of the fermentation facility is important 
in minimizing the risk of a bacteriophage outbreak. It 
should be easy to clean and keep dry. Most bacteriophage 
are susceptible to desiccation but can exist indefinitely in 
wet environments. Thus, standing water and damp areas 
should be eliminated. Spills should be cleaned up, and 
leaks fixed promptly. Hidden wet areas should be identified 
and dealt with. Floor mats are often overlooked as areas 
that can retain dampness. The floor surface should be of
a durable material which resists cracking and pitting, and
which is easy to keep spotlessly clean and dry. 

In one research facility for E. coli fermentations, a holding
tank in the waste system contributed to a chronic
bacteriophage problem. All of the fermentation wastes, 
including live E. coli cells, entered the holding tank. 
When the tank became full, the contents were pumped 
out, sterilized by passage through a heat-exchanger, and 
then directly dumped into a sewer system. Thus, the tank 
itself was never sterilized. This holding tank supported 
a resident bacteriophage population of about 1 x 10^5
plaque-forming units (PFU) per milliliter, presumably replicating
at the expense of the live E. coli cells regularly entering
the tank. The bacteriophage would occasionally find their
way up the drains and into the working areas of the facility.
Interestingly, the resident bacteriophage were initially
of one type, but less than 1 year later had been replaced 
by a completely different bacteriophage, indicating the 
occurrence of species succession. This second bacteriophage 
appeared to adapt to the facility in that within a year a 
variant form appeared which had a spontaneous deletion 
of about 1 kb from the genome of about 49 kb. This new 
form of the second bacteriophage immediately replaced 
the old form, and remained resident in the facility for 
several years. The solution to this problem was a redesign in 
which the tank could be drained and then sprayed internally
with a hot caustic disinfectant (see below). Obviously,
it is very important to eliminate any possibility of chronic 
bacteriophage contamination that can lead to species 
succession and adaptation to the facility. 

There has been considerable work on the utilization of 
fermentation medium components that might inactivate 
or inhibit the replication of any bacteriophage that enter 
the fermenter (1, 10, 11, 16, 30). The approaches include 
attempting to inhibit bacteriophage adsorption to bacterial
cells by using low-calcium and low-magnesium
media; metal ion chelators such as citrate, oxalate, or phytic
acid; nonionic detergents such as PEG, Tween-20, and
Tween-80; and adding inhibitors of bacteriophage nucleic
acid injection such as sodium tripolyphosphate. To date,
these approaches have been limited by the components 
being expensive, inefficient, or overly specific. 

Utilization of Bacteriophage- 
Resistant Strains 

The use of bacteriophage-resistant strains seems an 
obvious approach to bacteriophage control. However, there 
are several drawbacks. Strains resistant to multiple classes 
of bacteriophage are frequently harder or impossible to 
manipulate by genetic transduction. Resistant strains 
have also been observed to have lower product yields and 
altered flow characteristics affecting downstream
processing. The problem bacteriophage also can adapt to the
initially resistant host, rendering it useless. 


There are three approaches to effectively and safely 
using bacteriophage-resistant strains. One is to hold them 
in reserve, only to be used during a bacteriophage crisis. 
A second is to use a mixed culture of strains, each resistant 
to a different class of bacteriophage, in conjunction with 
a rotation schedule in the event of a bacteriophage
outbreak (24, 25, 28). The third is to use strains with well-
understood resistance mechanisms (1) that bacteriophage
would not be able to adapt to. For example, it is a
widespread practice in the biotechnology industry to use strains
of E. coli with nonrevertible mutations in fhuA (the outer
membrane receptor for T-odd bacteriophage) in order to
protect against the dreaded bacteriophage T1 (see below
and chapters 17, 19, and 20, which cover the biology of
phages T1, T5, and T3/T7, respectively).

Treating a Fermenter Which Has Been Lost
to Bacteriophage

While some bacteriophage are relatively heat-resistant, all 
are rapidly inactivated in solution when heated to over 
80°C (18), and thus a contaminated fermenter culture 
should be heat-sterilized in place. Hot caustic solutions (for 
example, 0.1 M sodium hydroxide at 50°C) are effective at 
eliminating bacteriophage from tanks and other equipment
that can withstand such treatment. Apparently all
commonly encountered bacteriophage are extremely sensitive
to acidic (pH 4 or lower) or alkaline (pH 11 or higher)
conditions (18). While it is thus a fairly straightforward
matter to inactivate all of the bacteriophage inside a
contaminated fermenter, with bacteriophage titers in
fermenters as high as 1 x 10^12 PFU/ml the facility around
the affected fermenter typically is also heavily contaminated.
Such contamination of the facility usually occurs as
a result of routine sampling of the fermenter culture
before the bacteriophage attack is recognized, as well as
from any leaks that may be present. During one outbreak,
a water puddle underneath a contaminated fermenter was
found to contain bacteriophage at about 1 x 10^8 PFU/ml.
The fermenter exhaust system can also spread large numbers
of bacteriophage, making it prudent to implement
methods to contain the exhaust gases. Such measures 
would include shutting off the airflow as soon as possible, 
watching for and disinfecting any foam that exits the 
fermenter, and using either a scrubber or a filter system on 
the exhaust system. The next section discusses strategies for 
cleaning and disinfecting the rest of the facility. 

Disinfection of Bacteriophage-
Contaminated Facilities

A bacteriophage-contaminated facility requires extensive 
and repeated cleaning. The cleaning agents employed must 
include disinfectants effective against bacteriophage. Wet 
and damp areas deserve special attention, but everything
in the facility should be cleaned or disposed of if possible.
Mopping, swabbing, and wiping with generous quantities 
of disinfectant solutions is the general approach. 

There have been numerous studies on the effects of 
various agents on bacteriophage, and several reviews have 
compiled lists of chemicals, radiation, etc., to which
bacteriophage are susceptible (1, 2, 9, 14, 20, 21). However, most of the
listed agents are limited by being too toxic, too expensive, too
specific, or only partially effective. Most notably, ethanol is
relatively ineffective against most bacteriophage, with decimal
reduction times (the time required for a 10-fold drop in
bacteriophage titer) for 100% ethanol of about 2 days and 
for 70% ethanol of about 12 hours. Germicidal UV light is 
highly effective, but only for surfaces it can reach, imparting 
limited utility. Widely used disinfectants with the active 
ingredients sodium o-phenylphenate and sodium p-tertiary 
amylphenate (such as the commercial disinfectant Vesphene
II) are totally ineffective against bacteriophage, but do have
utility in eliminating bacterial reservoirs of bacteriophage 
propagation. 

The bacterial fermentation industry needs bacteriophage
disinfectants that are relatively safe for the human
operators, inexpensive, and highly effective against a broad
range of bacteriophage. There are three types of disinfectants
that fit the bill. Sodium hypochlorite at 0.05% (for
example, a 1:100 dilution of commercial bleaches such
as Clorox) and formaldehyde (formalin) at 0.02% (for
example, a 1:100 dilution of the commonly used commercial
disinfectant DC&R) are both highly effective against
apparently all types of bacteriophage and are widely used 
for this purpose. These agents have decimal reduction 
times against bacteriophage of about 10 seconds. Both 
of these agents are also highly effective antibacterials. 
Drawbacks of these two agents are the relative toxicity of 
formaldehyde and the relative corrosiveness of sodium 
hypochlorite. 

The third good bacteriophage disinfectant is less well 
known, but should be more widely employed. It is ascorbic 
acid (vitamin C) with trace amounts of copper to catalyze 
auto-oxidation (13, 15). We call it “ascorbinator” because it 
is so effective against bacteriophage. The working solution
is 10 mM ascorbic acid (1.76 g/l) and 0.05 mM cupric
chloride (1.7 mg/l); the pH as prepared is about 3.2, and
is not adjusted further. Cupric sulfate can be substituted 
for the cupric chloride. The solution can be stored at 
room temperature for up to 1 week after being prepared. 
Ascorbinator is highly effective against apparently all 
types of bacteriophage, and exhibits a decimal reduction 
time against bacteriophage of about 4 seconds. It retains 
this level of effectiveness even when diluted 10-fold, and 
thus can be used to disinfect wet areas. Raising the pH 
above 5, however, increases the decimal reduction time to 
about 2.5 minutes. Ascorbinator is relatively ineffective as 
an antibacterial. It is very safe for the human operators, 
and is not corrosive like sodium hypochlorite. 
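
The decimal reduction times quoted in this section can be compared directly by asking how long each agent would need for a given log reduction. The sketch below assumes log-linear inactivation; the 6-log target and all identifier names are hypothetical choices for illustration, not recommendations from the cited studies.

```python
# Decimal reduction times in seconds, as quoted in the text.
D_VALUES_S = {
    "100% ethanol": 2 * 24 * 3600,       # about 2 days
    "70% ethanol": 12 * 3600,            # about 12 hours
    "0.05% sodium hypochlorite": 10,     # about 10 seconds
    "0.02% formaldehyde": 10,            # about 10 seconds
    "ascorbinator, as prepared": 4,      # about 4 seconds
    "ascorbinator, pH above 5": 150,     # about 2.5 minutes
}

TARGET_LOGS = 6  # hypothetical target: a million-fold drop in titer


def contact_time_s(d_value_s, log_reduction):
    """Contact time for the requested log reduction (log-linear model)."""
    return d_value_s * log_reduction


contact_times = {
    agent: contact_time_s(d, TARGET_LOGS) for agent, d in D_VALUES_S.items()
}
for agent, seconds in contact_times.items():
    print(f"{agent:28s} {seconds:>9.0f} s  ({seconds / 3600:.2f} h)")
```

On these figures the effective agents need well under two minutes of contact for a 6-log reduction, whereas 70% ethanol would need about three days.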


The mechanism of action of ascorbinator appears to
be the oxygen-dependent formation of free radicals during
the auto-oxidation of ascorbic acid, which cause
single-strand scissions in nucleic acids (13). It has long been
known that ascorbic acid is effective against bacteriophage
(2) and that copper enhances the effectiveness of
ascorbic acid as an antimicrobial (31); the subject is reviewed
in more detail by Eller et al. (6).

During clean-up efforts, movement of personnel within 
the facility should be controlled, and nonessential personnel 
kept out. It has frequently been observed that shoes are 
a major route of bacteriophage spread; shoe covers and 
changing areas are effective at minimizing this spread, as 
are shallow stepping trays or mats filled with a
disinfectant. The movement of wheeled carts, forklifts, and the like
should also be controlled or steps taken to disinfect their 
wheels. 

It is not possible by cleaning alone to completely eliminate
every bacteriophage particle from a contaminated facility.
Cleaning efforts should be as thorough as possible, with
the recognition that those bacteriophage that are missed 
will become inactivated over time. Dried on a surface, 
bacteriophage T7 has a decimal reduction time of about 
1 hour (18). Most bacteriophage on dry surfaces have 
decimal reduction times ranging from 1 hour to 10 days, 
and thus will eventually disappear from a contaminated 
facility. The one legendary exception is bacteriophage T1,
which when dry has been reported to persist almost
indefinitely (18). Only at high temperatures is appreciable
inactivation of T1 observed, with a decimal reduction time
on a dry surface at 90°C of about 14 hours (18). As noted
above, the main line of defense against a bacteriophage T1
outbreak is to employ resistant bacterial strains. 
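
To give a feel for how long residual contamination might persist on dry surfaces, the decimal reduction times above can be applied to a starting load. This is only a sketch: it assumes log-linear decay, and the starting load of 1e8 PFU is a hypothetical figure, as are the function and variable names.

```python
import math


def hours_until_one_pfu(initial_pfu, d_value_hours):
    """Hours for the expected count to fall to 1 PFU (log-linear decay)."""
    if initial_pfu <= 1:
        return 0.0
    return d_value_hours * math.log10(initial_pfu)


START_PFU = 1e8  # hypothetical residual load on a surface

# Decimal reduction times (hours) taken from the text.
cases = {
    "T7, dry surface": 1.0,
    "slowest of the typical range": 10 * 24.0,
    "T1, dry surface at 90°C": 14.0,
}
persistence = {
    label: hours_until_one_pfu(START_PFU, d) for label, d in cases.items()
}
for label, hours in persistence.items():
    print(f"{label}: {hours:.0f} h ({hours / 24:.1f} days)")
```

With these assumptions a T7-like phage would be gone within a working day, while a phage at the slow end of the range could persist for months, which is why thorough cleaning is still needed.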

Factors that Increase the Risk of a 
Bacteriophage Outbreak 

The external environments of soil, water, and sewage are 
a rich source of bacteriophage (8; and chapters 5, 33, and 
45) and should be kept out of the fermentation facility. 
Since bacteriophage are dependent on bacteria for growth 
and reproduction, good control of the strain being fermented
will help limit populations of bacteriophage which
can attack it. Good sample handling and storage, containment,
and waste handling practices are essential. At one
facility, a bacteriophage outbreak occurred every time
the fermenter exhaust filter was not changed on schedule,
or was inadvertently not installed at all. Apparently, the
release of bacterial cells in the exhaust contributed to the
propagation of high bacteriophage populations in the environment
near the air intake. A similar situation was found
in another facility where the air intakes were downwind
of the bacterial cell processing plant. For the recombinant
E. coli strains used in the biotechnology industry,
sewage provides a dangerous source of bacteriophage
(7, 8, and chapter 45). One E. coli facility, with air intakes
located downwind from a sewage treatment plant, suffered 
frequent bacteriophage outbreaks. Another E. coli facility 
had air intakes located near a dairy manure lagoon, with 
coliphage present in the lagoon at about 200 PFU/ml, and 
consequent bacteriophage outbreaks until the air intake 
was relocated. 

There is a seasonal aspect to bacteriophage outbreaks, 
with peaks occurring in January-March, and again in 
October-November. One E. coli facility that suffered five 
bacteriophage outbreaks in 14 years had one in January, 
one in March, two in October, and one in November (in 
different years in each case). It has been noted that there
are increased numbers of bacteriophage in soil and sewage 
during these times of the year (16). Tom Anderson has 
suggested that spring planting and fall harvesting activities 
may release bacteriophage from disrupted soils. Sherwood 
Casjens has suggested that the migrations of birds may 
play a role. 
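
The seasonal pattern can be encoded as a simple check, for example to decide when to step up cleaning and monitoring. The sketch below only restates the months given in the text; the function name and data layout are hypothetical.

```python
from collections import Counter

# Peak windows named in the text: January-March and October-November.
SEASON_MONTHS = {1, 2, 3, 10, 11}


def in_bacteriophage_season(month):
    """Return True if the month (1-12) falls in a reported peak window."""
    if not 1 <= month <= 12:
        raise ValueError("month must be in 1..12")
    return month in SEASON_MONTHS


# Months of the five outbreaks reported for one E. coli facility.
outbreak_months = [1, 3, 10, 10, 11]
tally = Counter(outbreak_months)
all_in_season = all(in_bacteriophage_season(m) for m in outbreak_months)
print(dict(tally), all_in_season)
```

All five reported outbreaks fall inside the two windows, although five events in 14 years is a small sample.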

Disruption of soil in any season may contribute to 
bacteriophage outbreaks. Several facilities have suffered 
outbreaks during ground-breaking for construction near 
the fermentation areas. In one such case a sewer line was 
accidentally broken during the construction work, an 
event that was quickly followed by a bacteriophage outbreak 
in the facility. In another case, at an E. coli facility suffering 
an outbreak, the culprit coliphage were found at a level 
of about 300 PFU/ml in a puddle at a nearby construction 
site (although it was not possible to pinpoint which was 
the cause and which the effect). One control measure to 
consider would be to cover disturbed ground at construction 
sites in order to minimize the release of bacteriophage. 

Salvage of Bacteriophage- 
Contaminated Cultures 

For most of the biotechnology industry, bacteriophage 
contamination dictates that the culture be discarded and 
any potentially contaminated product destroyed. For some 
food and commodity chemical products, if the culture 
viscosity or residual nutrient levels are not too high to affect 
downstream processing, and if the product titer is high 
enough to economically process, then the batch can be 
salvaged. 

There has been limited inquiry into the possibility of 
adding chemicals to a bacteriophage-contaminated culture 
which would not significantly affect the bacterial strain 
but which would destroy the bacteriophage or interfere 
with their replication (1, 30). Such approaches would 
require either a rapid on-line test for the presence of 
bacteriophage, or a fermentation process that is limping 
along with a low-level chronic bacteriophage contamination.
It is unlikely that any such approach would be
acceptable in the biotechnology industry. 


Concluding Remarks 

Fermentation facility personnel and laboratory staff 
should be trained to expect and respond to bacteriophage
outbreaks. While the problem may seem unavoidable
in principle, some facilities have run bacteriophage-free
for several years. They practice effective cleaning and monitoring,
and do not relax their guard or become complacent.
It is advisable to increase cleaning and monitoring during
“bacteriophage season,” January-March and October-
November, and during any significant ground-breaking
near the facility. Prevention measures and response plans 
to a positive monitoring result or the loss of a fermenter 
vessel to bacteriophage contamination should be in the 
form of written standard operating procedures. Eternal 
vigilance is the price of freedom from bacteriophage.

Phage, and plasmid, derivatives that had “picked up”
genes from the Escherichia coli chromosome led
Campbell (12) to propose his influential model based on
recombination between circular genomes (see chapter 7).
According to this model, segments of bacterial DNA were
added to a phage genome if excision of the prophage was
by an “aberrant” recombination event. These unexpected,
or illegitimate (2, 27, 96), events fortuitously created the early
recombinant clones that became tools at the cutting edge
of research in the pioneering days of molecular biology.
They, for example, facilitated the analyses of bacterial
operons (30, 61) and enabled the amplification of the Lac
repressor (60). Monitoring the bacterial enzymes specified
by λtrp phages enhanced our understanding of the regulatory
mechanisms of bacteriophage λ (18, 28, 29), telling
us much about the control of transcription from the early
λ promoters and suggesting how a phage promoter might
be harnessed to amplify gene products. While these first
“recombinant clones” of phages and plasmids were of
unpredictable content, by 1974 it was possible to design and
make new combinations of genes from any source in either
plasmid (15) or phage λ vectors (62, 76, 92). This article will
introduce the use of λ promoters within the context of
the phage genome and follow the use of these and other
phage promoters as components of genetically engineered
plasmids, where they are relieved of their essential roles
in the normal life cycle of the phage and instead used
to amplify gene products.

Expression of “Foreign” Genes in
λ Phages

Phage λ Itself

Genes cloned in phage λ, whether via in vivo or in vitro
recombination, are commonly positioned within the
“central region” of the genome between genes J and N (see
figure 43-1), where the DNA may be transcribed soon after
infection from the leftward promoter (pL) or, much later,
from the rightward promoter (pR') (see chapter 27). Transcription
from the early promoters of phage λ (pL and pR)
is susceptible to negative control by two phage proteins:
repressor (CI) and Cro (see chapter 8). While CI is essential
only for lysogeny, Cro protein moderates transcription
from pL and pR during the lytic pathway (25, 28, 70). Failure
to moderate the transcription of early genes has a detrimental
effect on the lytic pathway, including DNA replication
(26). During the normal lytic pathway, concomitant
with the moderation of transcription from early promoters,
Q protein serves to allow transcription initiated at
pR' to proceed beyond a transcription-terminator signal
(figure 43-1) that separates pR' from the late genes (99).
Q protein is essential for transcription of the late genes as
a single transcript that continues through the late genes
into the central region (8, 10) (see chapter 9).

Early experiments demonstrated that transcription
initiated at pL can proceed through sequences that normally
signal the termination of transcription (1, 29, 83). This robust
transcription is dependent on the interaction of N protein,
the product of the phage gene adjacent to pL, with RNA
polymerase at a specific sequence (nutL) within the transcript
(81). Q protein also serves to enable RNA polymerase
to read through termination signals (99). However,
neither N- nor Q-modified RNA polymerase is blind to
all terminator signals. The central region of the phage
genome includes sequences that are orientation-dependent
terminators of transcription from pL and pR' (10, 33, 37, 45)
(see figure 43-1). Effective transcription from these promoters
therefore will be prevented if a native terminator
sequence is retained between the phage promoter and the
inserted coding sequence of interest, but transcription is
likely to be unimpeded by potential terminator sequences
introduced from elsewhere. In early cloning experiments
the relevant coding sequence was usually flanked by
significant stretches of additional DNA, thereby increasing
the chance of interference from a transcription
terminator.

Figure 43-1 Phage λ: its genome and transcription circuits. Top: The phage DNA enters the bacterium as a linear molecule
but circularizes via its cohesive ends. Genes that encode related functions are clustered in the λ genome. Genes within the
“central region,” identified by the black segment, are inessential for propagation of the phage. att, located within the
central region, identifies the site at which site-specific recombination integrates the phage genome into the bacterial
chromosome. The map is not drawn to scale (for a more detailed look at the phage λ genome, see figure 27-2).

Bottom: Transcription of the genome proceeds initially from two promoters, pL and pR. At the earliest stage of infection,
transcripts initiated at pL and pR (I) terminate at specific signals, indicated by black diamonds ♦ (some rightward transcripts
initiated at pR reach the second termination signal). Only in the presence of the N gene product does most transcription
proceed beyond the termination signals into the downstream genes. The consequent synthesis of Q protein allows
transcription of the late genes (II) to proceed from pR' through the adjacent termination signal into the lysis genes and the
genes that specify the components of the phage head and tail. Transcription of the repressor gene (cI) during the
establishment of lysogeny is initiated at a promoter, pRE, situated to the right of cro. Repressor synthesis is maintained by
autogenously controlled transcription from pRM, a promoter that overlaps oR. Indicated are major promoters (•); major
termination signals susceptible to N- and/or Q-dependent antitermination (♦); and major termination signals that block
transcription by N- and/or Q-modified RNA polymerase (■).
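
The two classes of termination signal distinguished in the legend suggest a simple rule for whether a transcript from a phage promoter should reach a cloned insert. The following toy model is a deliberate simplification: it treats every terminator as absolute when no antiterminator is present (the legend notes that some transcripts do read through), and the class, function, and terminator labels are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Terminator:
    name: str
    # True for signals that stop even N- or Q-modified RNA polymerase.
    blocks_modified_polymerase: bool


def reaches_insert(terminators_in_path, antiterminator_present):
    """Return True if transcription is expected to reach the insert.

    terminators_in_path lists the terminators lying between the promoter
    and the insert, in the orientation in which they are active.
    """
    for terminator in terminators_in_path:
        if not antiterminator_present:
            return False  # unmodified polymerase stops at the first signal
        if terminator.blocks_modified_polymerase:
            return False  # even N- or Q-modified polymerase stops here
    return True


# Hypothetical labels standing in for the two symbol classes in the legend.
early_signal = Terminator("antiterminable signal", False)
central_signal = Terminator("central-region signal", True)

print(reaches_insert([early_signal], antiterminator_present=True))
print(reaches_insert([early_signal], antiterminator_present=False))
print(reaches_insert([early_signal, central_signal], antiterminator_present=True))
```

In this model, retaining a native central-region terminator between the promoter and the insert prevents expression even when N or Q protein is supplied, which matches the caution given in the text.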


Gene Expression in λ

The use of pL for amplifying gene expression could be
favored by the role of N protein in rendering RNA polymerase
insensitive to barriers that would otherwise impair
transcription. Such insensitivity provides a high maximal
rate of transcription, possibly an order of magnitude higher
than that from the promoter of the trp operon of E. coli
(18, 59). This maximal rate, however, is that obtained in
the absence of both CI and Cro. In practice, the maximal
exploitation of pL is not readily reconciled with either
Cro-moderated transcription during the normal life cycle
of the phage or the impaired DNA replication seen in the
absence of Cro (6, 64).

Many early experiments reporting the amplification
of gene expression from the pL promoter of phage λ
depended on infection of bacteria with high-titer lysates of
phage (43, 59). Infection with high-titer lysates can, in
part, alleviate the need for good replication of the
phage DNA. This may be useful for a Cro⁻ phage, but a
more experimenter-friendly route is via induction of a
dormant prophage. The efficient excision of the prophage
from the bacterial chromosome following induction requires
that the phage retains its attachment site and associated
site-specific recombination functions. Such derivatives of λ
are readily made by the insertion of DNA fragments within
the central region to the left of the attachment site (see
figure 43-1), where in one orientation a foreign gene can
be expressed by N-dependent transcription from pL (38)
and in the other orientation by Q-dependent transcription
from pR' (63).

Either infection of E. coli by phage λ in the absence of
repressor, or induction of a prophage, normally leads to
lysis of the cell within an hour. To maximize the yield of
the product from a bacterial gene within λ, it is essential
to delay lysis (61). This can be achieved by a mutation in
gene S. A defect in this phage gene prevents lysis of the cell
but permits DNA replication and allows protein synthesis
to continue for hours (34) (see chapter 10). Amplification of
gene products from pL has used phages defective in both
gene S and gene Q (38, 59, 64). In the absence of Q protein,
transcription of the late genes, including gene S, is not
activated, and the protein-synthesizing machinery of the
cell is not diverted to the synthesis of phage proteins.
Amplification from pR' has depended on vectors defective in
gene S and gene E (55, 63). The mutation in gene E serves to
block synthesis of the major component of the phage capsid,
and the packaging of phage genomes.

Transcriptional Interference 

Efficient transcription from phage promoters raises the
problem of transcriptional inhibition by convergent
transcription, particularly when aided by proteins that allow
RNA polymerase to ignore terminator sequences. The effect
of convergent transcription on gene expression has been
monitored by experiments using the pL and ptrp promoters
within phage λ (95); trp expression is impaired by
transcription from pL and vice versa. Many expression
vectors include a transcription termination signal that
prevents transcription from the strong promoter from either
opposing or reinforcing transcription of other genes.

Amplification of Useful Enzymes 

Examples in which induction of a λ prophage has been
used to amplify useful enzymes include the DNA ligase of
E. coli from its own promoter (67); DNA Pol I from the
combined use of its own promoter and pL (64); and T4
DNA ligase (63) and T4 polynucleotide kinase (Pnk) (55)
from pR'. The genes for both T4 DNA ligase and Pnk are
positioned close to the late genes, adjacent to gene J, and
hence to the left of the natural terminators of rightwards
transcription (figure 43-1). Efficient expression of polA from
pR' was not obtained; the polA gene in this phage is more
remote from the late genes than those specifying T4 DNA
ligase and Pnk from pR' (64). It is possible that a terminator
for rightward transcription was retained in the λpol
phage. The lysogenic strains referred to remain in commercial
use, and provide levels of amplification approaching, or
exceeding, 5% of soluble cell protein without any manipulation
of the cloned DNA fragments.

Stability and Toxicity 

The amplification of some proteins is toxic to bacterial 
cells. While the DNA polymerase I and DNA ligase genes 
were readily cloned in phage λ, and maintained within a
prophage or during lytic infection, experiments indicated
that their maintenance in a multicopy plasmid would need 
tight control of transcription. As predicted (43, 64, 77), the 
transfer of the functional E. coli polA gene to a multicopy 
plasmid required the dissection of the gene from its own 
promoter (57). 

Cloning the gene for T4 Pnk (pseT) illustrated that both 
position and orientation of foreign coding sequences with 
respect to the phage promoters can influence vegetative 
propagation of recombinant phages, and the recovery of 
foreign genes. The pseT gene was initially cloned following 
insertion of a relevant T4 DNA fragment within the 
cI gene of a λ vector in the opposite orientation to that in 
which cI is transcribed (figure 43-1); no rightward transcription 
of the cI region from a λ promoter is anticipated, 
and transcription of pseT will be opposed by transcription 
from pRE (see legend to figure 43-1). The λ pseT phages 
were readily propagated, but phage growth was impaired 
when the DNA fragment containing pseT was transferred 
to the central region of the λ genome. Moreover, each subclone 
that was examined carried pseT in the orientation 
that avoids transcription from the early λ promoter, pL 
(figure 43-1). Even so, the recombinant phages produced 
tiny plaques. In contrast, strains lysogenic for the recombinant 
phage grew well, and they were stable (55). The 
maintenance of a cloned gene as a single copy is a virtue of 
phage λ, a virtue that not only increases the stability of 
the cloned DNA but also facilitates genetic analysis of the 
coding sequence. Some new plasmid vectors (described 
later) now mimic λ in this respect, that is, they are maintained 
in single copy, but their design permits amplification 
to a copy number of around 30 (84). 

Expression Plasmids that Use a λ Promoter 

Promoter pL 

Each of the three major promoters of phage λ has been 
used to drive transcription of a foreign gene within a plasmid 
vector (see figure 43-2 for examples). Transcription 
from pL is regulated by CI, but plasmid vectors that use 
pL usually retain nutL, the site that permits N protein to 
modify RNA polymerase (see pKC30 in figure 43-2). 
Traditionally, the control of transcription, as with phage λ 
vectors, is mediated by a thermosensitive form of CI, rather 
than wild-type repressor, and relief of repression is achieved 
by raising the temperature from 32°C to 42°C. Host strains 
lysogenic for a defective (Cro⁻) prophage provide the 
temperature-sensitive CI, and in one strain the defective 
prophage supplies N protein in the absence of Cro following 
the inactivation of repressor (5). The early cro⁻ strains 
(13) included a deletion that extends from the prophage 
into the bacterial genome, leading to loss of the uvrB gene. 
Figure 43-2 Simple examples of plasmid expression vectors using λ promoters. In each diagram the relevant elements of λ 
within the plasmid vector are located within the black segment. Top: Transcription in pKC30 is driven from pL and is blocked 
by λ repressor provided in trans (79). The resulting transcript includes nutL, the site at which N protein can interact with RNA 
polymerase to enable transcription to proceed beyond stop signals. A coding sequence inserted at the HpaI site will be 
expressed if it includes a ribosome binding site. N protein can be provided in trans following derepression of the defective 
cro⁻ prophage that encodes repressor. In the presence of N protein transcription will read through the termination signal tL. 
Middle: Transcription in pCQV2 is driven from pR, and repressor is provided by the cI gene (cI857) present on the plasmid 
(75). The cI gene is transcribed from pRM, a promoter that is autogenously regulated by λ repressor. The λ sequence includes 
the Shine-Dalgarno sequence and initiation codon of the cro gene. Bottom: Transcription in pQTE is driven from the late λ 
promoter (22). Q gene product is essential for transcription from pR' to proceed through tR'. Derepression of the lac 
promoter activates transcription of Q. Coding sequences inserted downstream of pR' will be translated if they have a 
ribosome binding site. 


These strains are sensitive to UV light and grow poorly. 
Recent cro⁻ lysogens retain uvrB (69). They are more 
vigorous and respond better to the presence of N protein, 
though it is not possible to predict whether induction of 
the N⁺ or N⁻ prophage will give the better amplification of 
protein (69). 

A potential problem with the use of temperature 
induction is the concomitant induction of the heat-shock 
response, a response that includes the activation of proteolytic 
activity, which can therefore result in degradation 
of the foreign protein (3, 9). In addition, high 
temperature may have detrimental effects on the structure 
and solubility of protein products. Other methods of relieving 
repression have therefore been sought. For a wild-type 
cI gene this can be achieved by treatment with mitomycin C 
or nalidixic acid (85). Alternatively, λ repressor can 
be provided by a chromosomally located cI gene whose 
expression is dependent on a promoter taken from the 
trp operon of Salmonella (49, 56). In this case, the addition 
of tryptophan represses transcription of cI from ptrp, 
thereby leading to activation of transcription from pL. 
Wild-type repressor provides very tight control of transcription 
from the powerful pL promoter, tighter control 
than that provided by the thermosensitive form, while 
derepression is readily achieved in the absence of the 
heat-shock response. 

Promoter pR 

The second plasmid shown in figure 43-2, pCQV2, incorporates 
pR rather than pL (75). The cI gene encoding a 
thermosensitive repressor is included within the plasmid 
vector and is transcribed from its normal promoter, pRM, 
transcription from which is itself activated by CI protein 
(73). This autogenous regulation serves to maintain transcription 
of cI, but it should be noted that if this plasmid 
enters a cell that is devoid of CI, then transcription from 
pRM will be inefficient and that from pR will be efficient. 
This potential problem is readily avoided by the use of 
a host that is lysogenic for wild-type λ. The lysogenic 
strain can be cured of its prophage once the plasmid is 
established. 

Promoter pR' 

Two plasmids have been described in which expression 
relies on the Q-dependent transcription from the late λ 
promoter pR' (22, 71). This system is predicted to provide 
tight control from an efficient promoter using RNA polymerase 
modified by Q protein (99). As already mentioned, 
pR' has been used effectively even within the context of 
a λ vector in which transcription must traverse approximately 
20 genes before reading the coding sequence of 
interest (55, 63). In pQTE1, the third vector shown in 
figure 43-2, Q protein is supplied under the control 
system of the lac operon (22). An alternative vector, 
pCEQ3, has transcription of gene Q under the control of a 
thermosensitive repressor (71). For this vector it was shown 
that induction of Q was achieved at 40°C in the absence of 
the heat-shock response. The potential promise of these 
systems was supported by the expression of lamB (22) and 
the IFNα2 gene (71), but recent literature appears to 
provide little evidence for the application of pR'. It is 
probable that this system entered the scene when others 
were already refined and their use well established. 

Translation of Messenger RNA 

Transcription from pL in plasmid vectors is seldom a 
problem. Translation, however, cannot be guaranteed: 
the secondary structure of the messenger RNA and consequent 
exposure of the ribosome binding site (RBS) are 
uncertain (40). Translation of messenger RNA from λ 
promoters can rely on a RBS associated with the cloned 
gene. Alternatively, the coding sequence can be positioned 
to take advantage of a RBS (e.g., gene N, cro, or cII) within 
the vector. In some cases, a series of fusions is made in 
which the coding sequence is placed close to a 
Shine-Dalgarno sequence within the vector and the fusion that 
gives the highest yield of gene product is selected (78). 
Alternative systems with a higher probability of efficient 
translation are those designed to yield the foreign protein 
as a fusion product. A special example of this is the E. coli 
thioredoxin system in which the pL promoter is controlled 
by the tryptophan-mediated repression of the cI gene and 
the thioredoxin part of the fusion protein directs the 
product to adhesion zones. From there it may be released 
by osmotic shock (49). Such fusion systems are likely to 
conserve a messenger RNA sequence in which the RBS is 
exposed for efficient translation. 

Some Additional Modifications 

Many expression plasmids use the early promoters of λ (5, 
35, 48, 75, 77, 79). Some, such as pCYTEXP, include both in 
tandem (4). pCYTEXP is most notable for its inclusion of a 
sequence shown to favor efficient translation of the bacterial 
gene, atpE. This sequence was named TIR for Translation 
Initiation Region and is presumed to present the 
RBS within a favorable secondary structure. The pCYTEXP 
vector was made so that its components can be removed 
and exchanged. This modular construction means that the 
segment including TIR can be excised and its sequence 
subjected to site-directed mutagenesis with the aim of 
changing the sequence to enhance translation of the 
relevant foreign gene. 

Another vector includes the par locus from plasmid 
pSC101 to ensure efficient segregation of the plasmid at 
cell division (51), a feature that may be advantageous in 
large-scale cultures. 

E. coli Plasmids that Harbor a Promoter from a Virulent Phage 

The Promoters of T3, T7, and SP6 

Phage λ was an early paradigm for the study of gene 
regulation, and it is not surprising that its promoters, 
along with those of the lac and trp operons, were the first 
to be adopted for gene expression in plasmid vectors. The 
promoters of some virulent phages offer quite different 
approaches for controlled transcription. Most virulent 
phages possess powerful promoters and, in many phages, 
these include promoters that are not recognized by bacterial 
RNA polymerase. Phages T3 and T7 of E. coli, and their 
relative SP6 of Salmonella typhimurium, specify their own 
RNA polymerases, each of which recognizes a phage-specific 
promoter sequence (see chapter 20 for review of phage T7 
and its relatives). Only rarely do these phage polymerases 
encounter a “promoter-like” sequence in bacterial 
DNA. Also, these very selective RNA polymerases are more 
active than their bacterial counterparts and are seldom 
impeded by termination signals (89). The phage-specific 
polymerases were first used for the synthesis of RNA 
in vitro. Of the three phages mentioned, the RNA polymerase 
of SP6 was especially amenable because this enzyme is 
stable and easy to purify (11), but soon the gene specifying 
T7 RNA polymerase had itself been cloned and overexpressed. 
This enzyme also became easy to purify in large 
quantities (17, 91). An SP6 or T7 promoter within plasmid 
DNA is essentially the sole target for transcription by its 
cognate RNA polymerase (11, 52). 

Such systems were quickly exploited for the synthesis 
in vitro of biologically active RNA (54). The resultant 
single-stranded RNA can serve, for example, as a transcript 
for translation, as antisense RNA, as a substrate for RNA 
processing, or simply for the generation of labeled probes. 
Modifications to the majority of cloning vectors, whether 
plasmid, λ, or cosmid, were made so that their cloning sites 
were flanked by two promoters, either T7 and SP6 or T7 
and T3; hence either strand of any cloned DNA can be 
transcribed in vitro. 
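
The consequence of flanking a cloning site with two opposed promoters can be illustrated with a minimal sketch. It assumes an idealized run-off reaction in which each polymerase copies the entire insert; the function names and the six-base example insert are hypothetical and are not taken from any of the vectors cited above.

```python
# Minimal sketch, assuming idealized run-off transcription of the whole insert.
# All names and the example sequence are illustrative assumptions.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def flanking_promoter_transcripts(insert_top_strand: str) -> dict:
    """RNA produced from two promoters that face each other across an insert.

    The promoter on the left reads rightward, so its RNA matches the top
    strand (with U for T). The promoter on the right reads leftward, so its
    RNA matches the bottom strand, i.e. the reverse complement.
    """
    top = insert_top_strand.upper()
    return {
        "left_promoter": top.replace("T", "U"),
        "right_promoter": reverse_complement(top).replace("T", "U"),
    }


rnas = flanking_promoter_transcripts("ATGGCC")
print(rnas["left_promoter"])   # AUGGCC (sense transcript)
print(rnas["right_promoter"])  # GGCCAU (antisense transcript)
```

Choosing which polymerase to add to the reaction therefore decides whether the sense or the antisense strand is produced from the same clone.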

The T7 promoter first used by Tabor and Richardson 
(91) and by Studier and Moffatt (88) has probably become 
the most commonly used promoter for the amplification 
of gene products. Families of vectors, designated pET 
(Plasmid for Expression by T7 RNA polymerase), devised by 
Studier, Rosenberg, and colleagues at Brookhaven National 
Laboratory (80, 89) have continued to evolve within the 
biotech company, Novagen. 

The pET Vectors 

The original pET vectors were derived from pBR322 and 
all included the promoter that precedes gene 10 of T7 (ϕ10) 
followed by unique sites at which a coding sequence 
can be introduced, so-called cloning sites (figure 43-3). 
Downstream of the cloning site(s) some vectors include Tϕ, 
the transcription terminator that follows gene 10 in the 
T7 genome. Others specify, instead, an RNase III cleavage 
site that does not cause termination of transcription but 
allows RNase III to process the transcript (80). 

Some vectors lack a RBS and are essentially transcription 
vectors. Others are designed to favor efficient translation 
and to do so they take advantage of the translation 
start signals of gene 10, referred to as S10 (see figure 43-3). 
The product of gene 10, the major capsid protein of T7, is 
made more rapidly than any other phage protein during 
infection, and its RBS is believed to be set within the context 
of an RNA sequence that favors translation (58, 66). In 
the early vectors, in-frame fusions could be made at early 
or late positions within gene 10 (codon 11 or 260); RNA 
from such fusions is likely to retain the secondary 
structure that favors translation of the gene 10 sequence. 
Alternative vectors include an NdeI site within the second 
and third codons and are used to join coding sequences 
without making changes within the sequence of the foreign 
protein. However, the efficiency of translation of the mRNA 
from these fusions is less predictable because the new 
sequence may change the secondary structure of the RNA. 

More recent vectors are modified to generate gene products 
fused to any of a variety of tags. The tag may be chosen 
to aid identification, purification, or stability of the product. 
The vectors may also include a sequence that encodes a target 
for proteolysis to facilitate separation of the protein of 
interest from its tag. 

Successful control of expression from the T7 promoter 
within the pET vectors is determined by the availability 
of the T7 RNA polymerase. Classically, the provision of 
the T7 RNA polymerase specified by a λ prophage (λDE3) 
has been under the control of the lacUV5 promoter. 
Induction of transcription from the T7 promoter follows the 
addition of IPTG and the consequent expression of the 
T7 RNA polymerase gene within the resident prophage. 
This system of control provides effective transcription 
from the T7 promoter in the pET vector but allows some 
constitutive (basal) transcription of the RNA polymerase 
Figure 43-3 The key features of pET vectors. In each diagram the relevant elements of T7 are located within the black 
segment. All pET vectors are based on the promoter of gene 10 of phage T7 (Pϕ10) (88). Some include the ribosome binding 
site (S10) of this gene and those shown include the transcription terminator, Tϕ, downstream of gene 10. In the simple 
translation vector the NdeI site permits fusion of a coding sequence to the ribosome binding site of gene 10, while insertion 
at the BamHI site generates a fusion protein. More recent vectors include a series of targets for alternative restriction 
enzymes, known as a multiple cloning site (MCS). 
gene. Even a low level of RNA polymerase can be a grave 
problem if the foreign gene encodes a protein that is toxic 
to the bacterium. Various modifications have been used 
to tighten the control of expression from the T7 promoter. 
One modification depends on the presence of T7 lysozyme, 
which has been shown to be a natural inhibitor of 
T7 RNA polymerase (39, 58) (see chapter 20). In some cases, 
T7 lysozyme, specified on a compatible plasmid, can suffice 
to reduce the basal activity of T7 RNA polymerase in the 
lysogenic strain and allow potentially toxic genes to be 
established in the context of the pET vector (87). It is possible, 
though less convenient, to maintain the foreign gene 
in the complete absence of T7 RNA polymerase and activate 
expression of the foreign gene by infection with a 
λ phage (λCE6) that encodes the T7 RNA polymerase. 

An additional refinement of pET vectors is the inclusion 
of a lac operator in association with the T7 promoter, 
hence pT7lac (19). The basal level of transcription from 
the T7 promoter is reduced by a lac operator centered 
15 bp downstream from the RNA start. In the absence of 
the bound repressor, or following the addition of inducers, 
transcription from the T7 promoter remains powerful. 
The new pET vectors carry a lacI gene to provide sufficient 
repressor to control both the chromosomal gene for T7 RNA 
polymerase and the T7lac promoter in the multicopy expression 
vector. The pET vectors based on the T7lac promoter 
and the lacI gene (figure 43-3) are able to maintain most 
genes in the presence of the λDE3 prophage, even if they 
encode a toxic gene product. In this system, the basal level 
of expression from the T7 promoter is very low and yet 
induction can produce high levels of the toxic gene product. 
In some cases, where this low level of residual expression 
makes the plasmid difficult to maintain, the addition of a 
compatible plasmid encoding T7 lysozyme has stabilized 
the system. More recently, the dual control of plasmid copy 
number and transcription serves to block the production 
of toxic gene products prior to induction (97). The maintenance 
of pETcoco (a copy control derivative of pT7lac) in 
single copy, rather than 30 copies, provides a proportional 
(approximately 30-fold) decrease in the basal level of 
transcription from the T7 promoter. 
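
The proportionality behind the 30-fold figure can be written out as a short worked example. This is only a sketch under the assumption that every plasmid copy contributes the same basal transcription; the function name and the arbitrary per-copy unit are illustrative and not part of the cited work.

```python
# Minimal sketch: basal transcription assumed to scale linearly with copy number.
# The per-copy rate is an arbitrary unit chosen for illustration.
def basal_transcription(copy_number: int, basal_per_copy: float = 1.0) -> float:
    """Total uninduced transcription per cell from all copies of the plasmid."""
    return copy_number * basal_per_copy


multicopy = basal_transcription(30)    # plasmid at about 30 copies per cell
single_copy = basal_transcription(1)   # same plasmid held at single copy
fold_decrease = multicopy / single_copy
print(fold_decrease)  # 30.0
```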


T7 Promoters in Organisms other than E. coli 

The T7 system has been used effectively in streptomycetes 
(36, 72). In this case an inducible T7 RNA polymerase 
gene has been placed within the bacterial chromosome 
(J. Altenbuchner, personal communication). Expression of 
foreign genes from a T7 promoter has been demonstrated 
in the cytoplasm of mammalian cells following delivery 
by a viral vector (31) and there is evidence that T7 RNA 
polymerase can be directed to the nucleus by the inclusion 
of a nuclear location signal (20). When T7 RNA polymerase 
is overproduced in the cytoplasm of yeast, some 
enters the nucleus and elicits transcription from a T7 
promoter sequence in the chromosome (14). 

Other Coliphage Promoters 

Some virulent phages, including T5 and the T-even phages 
(see chapters 19 and 18, respectively), do not specify their 
own RNA polymerase. Instead, these phages impose 
modifications that subvert the transcription machinery 
of their hosts. The modifications are not easily exploited, 
but both the extraordinary strength of a T5 promoter and 
features that enhance translation of messenger RNA from 
a T4 promoter have been tested in expression vectors. 
For instance, a synthetic T5 promoter in association 
with an effective RBS has served to produce human 
interferon-γ in E. coli. In this example, successful amplification 
of the gene product was obtained without any 
control of transcription, and interferon constituted 15% of 
total cell protein (42). 

The pN25 promoter of T5 outcompetes the promoter 
for the β-lactamase gene by a ratio of 25:1 (90). A very versatile 
plasmid vector (pDS6) includes this promoter fused 
to the lac operator (pN25x/o) in the absence of a RBS. 
The cloning sites separate pN25x/o from a transcription 
termination signal taken from phage λ. In vivo, transcription 
from this promoter can be moderated by lac 
repressor, and translation of messenger RNA can be 
obtained if the cloned coding sequence specifies a RBS. 
The power of the T5 promoter dictates that approximately 
95% of the transcripts from plasmid DNA initiate from 
the T5 promoter. In vitro, in the presence of 7mGpppA 
and the four NTPs, the resulting messenger RNA is 
capped and serves directly for translation in a wheat germ 
system (90). 
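
The two figures quoted for the T5 promoter, a 25:1 advantage and roughly 95% of plasmid transcripts, are of the same order, as a short calculation suggests. The sketch below assumes, purely for illustration, that only two promoters compete for polymerase on the plasmid; the source does not state that the two figures come from the same measurement, and the function name is hypothetical.

```python
# Minimal sketch, assuming exactly two competing promoters on the plasmid.
def fraction_from_strong_promoter(strength_ratio: float) -> float:
    """Share of initiations at the stronger promoter for a ratio of N:1."""
    return strength_ratio / (strength_ratio + 1.0)


share = fraction_from_strong_promoter(25.0)
print(round(share, 3))  # 0.962
```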

One plasmid expression system derives from phage T4. 
During the life cycle of this phage, massive quantities of 
gene 32 product are required. Gene 32 is endowed with 
a strong promoter and its messenger RNA includes 
efficient signals for translation. The messenger RNA is 
noteworthy not only because it is translated with high 
efficiency but also because it is unusually stable. Using 
these features in an expression vector (21) it was shown 
that infection with phage T4 could enhance the stability 
of a foreign protein. At least part of this protective 
effect is dependent on the pin gene of T4. The Pin protein 
inhibits the Lon protease and may inhibit other 
proteases (86). Pin can be provided in trans from a compatible 
plasmid. 

Convergent Transcription 

RNA polymerase complexes traveling from opposing 
promoters may collide if no transcription termination 
signal intervenes. Convergent transcription can impair 
gene expression, more particularly that from the weaker 
promoter. This inhibition of gene expression can result 
from the interaction of messenger RNA with antisense 
RNA (16) or from the cis-specific, steric hindrance of transcription 
(23, 95). Within a natural context, convergent 
transcription can make a regulatory contribution to a 
developmental pathway, as documented for phage λ (see, for 
example, 46, 53). Within the context of an expression vector, 
uncontrolled transcription from a strong promoter could 
interfere with the expression of an essential gene within 
the plasmid and possibly with the replication of the vector 
(7, 77). On the other hand, transcriptional interference 
has been used constructively to tighten the repression of a 
gene specifying a restriction endonuclease in a cell lacking 
protection from the cognate modification enzyme (65). This 
principle (figure 43-4) was shown to enable the cloning 
of a gene encoding a restriction endonuclease in the absence 
of the modification gene (44). The converse effect, the relief 
of transcriptional interference and the consequent expression 
of the gene from the opposing, weaker, promoter, 
was developed as a selection system for sequence-specific 
DNA-binding proteins (24). In this situation expression 
from the weaker promoter was recognized by the acquisition 
of a drug-resistant phenotype when a sequence-specific 
DNA-binding protein blocked transcription from the 
opposing promoter. 


[Figure 43-4 panel labels: 37°C, no IPTG, R gene is not expressed (cI857 cannot bind to OcI; LacI binds to Olac). 30°C, no IPTG, R gene is not expressed.] 



Figure 43-4 Plasmid pLT7K and its use. An illustration of the 
use of pLT7K to clone a gene (R) specifying a restriction 
endonuclease in the absence of its cognate modification 
enzyme. At 37°C in the absence of IPTG the Lac repressor 
binds to the lac operator, and transcription from the T7 
promoter is minimal; the λ temperature-sensitive repressor 
is inactivated, and transcription from “PR” (the λ promoter, 
pR) antagonizes any residual transcription of gene R from the 
T7 promoter. At 30°C, in the presence of IPTG, transcription 
of gene R occurs from the T7 promoter, and transcription 
from the opposing promoter is repressed. This figure is 
taken with permission from Kong et al. (44). 
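
The switching logic described in the legend can be summarized as a small truth table. The sketch below is a deliberate simplification: it treats each promoter as fully on or off, assumes that the thermosensitive repressor is active at 30°C and inactive at 37°C (the only two temperatures the legend mentions), and assumes that gene R is expressed only when the T7 promoter is on and the opposing pR is off. The function names are hypothetical.

```python
# Minimal sketch of the two-promoter switch; all-or-none states are an assumption.
def promoter_states(temperature_c: int, iptg: bool) -> dict:
    """Return the assumed on/off state of each promoter flanking gene R."""
    return {
        # IPTG releases Lac repressor from the lac operator at the T7 promoter.
        "t7_on": iptg,
        # The temperature-sensitive repressor is assumed inactive at 37°C or above,
        # which leaves the opposing promoter pR free to fire.
        "pR_on": temperature_c >= 37,
    }


def r_gene_expressed(temperature_c: int, iptg: bool) -> bool:
    """Gene R is taken as expressed only when T7 is on and pR is off."""
    states = promoter_states(temperature_c, iptg)
    return states["t7_on"] and not states["pR_on"]


print(r_gene_expressed(37, False))  # False: maintenance conditions
print(r_gene_expressed(30, False))  # False: T7 promoter still repressed
print(r_gene_expressed(30, True))   # True: induction conditions
```

The combination of 37°C with IPTG is not described in the legend, so the value this sketch returns for it should not be read as a reported result.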


Noncoliphage Promoters 

Bacillus subtilis is an extensively studied Gram-positive 
bacterium and an attractive host for protein production 
for a number of reasons: it is nonpathogenic, its fermentation 
properties are well understood, and it is capable of 
secreting proteins, especially those originating from 
Gram-positive bacteria. A number of B. subtilis phages have been 
well studied and some advantage has been taken of their 
promoters. The late promoter (A3) of bacteriophage ϕ29, 
maintained within a plasmid vector, provides constitutive 
expression during exponential growth of Streptomyces 
lividans (74). This system failed to give high levels of 
agarase in Streptomyces species (68). A hybrid control 
system, derived from a promoter of the virulent B. subtilis 
phage SPO1 and the lac operator of E. coli, has been used in 
B. subtilis to give inducible expression of a human leukocyte 
interferon A gene (98). 

Considerable effort has been given to harnessing an 
inducible promoter of a B. subtilis prophage in order to 
take advantage of maintenance as a single copy prior to 
induction (32, 50, 93). The “shot-gun” cloning of a promoterless 
lacZ gene in the temperate phage ϕ105 identified a 
strong inducible promoter. Transcription of heterologous 
genes, inserted at the site identified by the reporter, can 
be controlled by a temperature-sensitive repressor and, 
fortuitously, the insertion itself makes the prophage defective 
in cell lysis. The site of insertion appears to be functionally 
equivalent to that downstream of pR' in phage λ. 

A strong transcriptional element comprising a pair of 
tandemly arranged promoters from phage I19 of Streptomyces 
ghanaensis has been isolated and shown to respond to 
RNA polymerase holoenzyme from E. coli and S. lividans 
(47). However, as mentioned earlier, the T7 promoter system 
is now available in streptomycetes (72). 

E. coli still remains the chief bacterial factory for the 
production of proteins within the laboratory. Nevertheless, 
phage promoters continue to be exploited in a wide 
variety of bacteria for many genetic analyses (94) and tests, 
including their use with the luciferase reporter within 
mycobacteriophages as a sensitive assay for live mycobacteria 
(41, 82) (these methods are additionally reviewed 
in chapters 38 and 46, covering mycobacteriophage and 
phage-based bacterial diagnosis, respectively). 


A Phage-Based Perspective 

The amplification of heterologous proteins is still an 
unpredictable mission: efficient translation of messenger 
RNA is not guaranteed, the protein may fail to assemble 
correctly, and the protein may be a substrate for degradation 
by host proteases. Expression of coding sequences as 
fusion products may overcome these problems. 
Control of transcription has always been of prime importance 
for protein amplification. Plasmid systems dependent 
on an early promoter of λ or the T7 gene 10 promoter 
have been refined to combat this problem, but phage λ and 
its prophage remain a simple though under-used alternative. 
Early experiments with polA illustrate the vulnerability 
of E. coli to the expression of genes in a multicopy 
plasmid. The polA gene with its own promoter is stably maintained 
in a λ vector during lytic or lysogenic propagation, 
but could not be maintained in a multicopy plasmid (43, 64). 
When the promoter-defective gene was transferred to a 
multicopy plasmid, with transcription of polA from the pL 
promoter of λ controlled by temperature-sensitive repressor, 
the recombinant plasmid remained difficult to maintain (57). 

The role of Cro in the moderation of the early promoter 
of phage λ precludes the optimal use of pL from an induced 
prophage. However, the late promoter pR' has been harnessed 
effectively even when shared with most of the late 
genes of λ; T4 DNA ligase and Pnk comprised 5% and 7% 
of soluble cell protein, respectively (55, 63). The expression 
of a heterologous gene from pR' of a prophage requires 
derepression of pR to produce Q protein, which is necessary 
for expression of genes from pR'. Could the prophage system 
be modified to take even better advantage of pR' than 
currently reported? It probably could. A genetic trick to 
make the prophage defective by deletion of the 20 or more 
phage genes that separate the heterologous gene from pR' 
would focus the protein-synthesizing machinery on the 
message of interest, and the amplification of the heterologous 
product would no longer be accompanied by the 
production of many phage proteins. Both proximity to 
the pR' promoter and loss of competition from 20 phage 
genes should elevate amplification of the protein of interest. 
However, the new “copy control” plasmids now offer 
the same benefits as prophage induction. Such plasmids 
provide tight control from a cloned gene in single copy, and 
efficient transcription following plasmid amplification 
(84, 97). 
Phage in Display 

BJORN H. LINDQVIST 


Phage display is a process by which a peptide or a 
protein is expressed as an exterior fusion to a surface 
protein of a phage particle. The peptide or protein sequence 
can be deduced from its encoding DNA sequence 
that resides in the phage particle or in a transductant. 
Amplification of the DNA of interest can take place by 
phage/transductant propagation or by polymerase chain 
reaction (PCR). By producing large populations of phage 
particles, each expressing a unique peptide or protein, 
peptide/protein libraries can be obtained. Peptides or 
proteins interacting with defined molecular targets 
(most often proteins) can be isolated from such libraries by 
enrichment through repeated cycles of panning. Hence, 
phage display can be thought of as a “search engine” of 
protein-target interactions. 

The pioneering work of Smith (113) first demonstrated 
surface display of peptides in filamentous phage fd. This 
innovation was extended to peptide libraries of fd and 
M13 (19, 22, 108) and phagemid display was introduced (6). 
The display of proteins such as antibody domains and combinatorial 
antibody libraries soon followed (5, 59, 75). From 
the beginning of the 1990s phage-display-related publications 
have grown exponentially (128). Several reviews 
(some general, e.g., 58, 125; and many specialized) are 
available. There are numerous reports and several laboratory 
manuals (4) describing development and use of filamentous 
phage display in identification of peptide or protein 
interactions with simple organic compounds, antibodies, 
receptors, etc. Phage display is also a useful tool in protein 
engineering and directed evolution (44). Then there is the 
large sector of phage antibody display (64) and the more 
recent field of immune profiling and its implication 
for vaccine development (15, 112). Furthermore, complex 
targets such as cells (92) and whole tissues/organs (91) have 
been subjected to phage display analysis. These studies 
explore novel approaches for in vivo homing in gene/drug 
delivery (78), cancer surveillance/treatment (2, 86, 103), and 
imaging (127). 

To extend the powers of filamentous phage display to 
other phage systems, phages λ, T7, and T4 have now been 
engineered for peptide/protein display (10, 49). This chapter 
will focus on developments in filamentous phage display 
and the emerging λ, T4, and T7 display systems. Current 
phage display options are listed in table 44-1. 

Filamentous Phage Display 

The biology of the filamentous single-stranded DNA phages 
(f1, fd, and M13) is described in chapter 12. For the purposes of 
phage display, these particles consist of one major and four 
minor coat proteins embedding the circular single-stranded 
DNA genome of 6400 nucleotides. The major coat 
protein (P8*) of phage M13 consists of 50 amino acids and 
is present in 2700 copies in the phage particle. Proteins P3 
and P6, which are localized at the infecting tip of the particle, 
are present in five copies each while proteins P7 and 
P9 reside at the opposite end of the phage particles. 

The particles infect E. coli F⁺ cells by interacting with 
the F-pili through their minor capsid component, P3, leading 
to single-stranded DNA injection and conversion to 
a replication-form DNA (RF-DNA) molecule capable of 
rolling-circle DNA replication. In the case of phage fd, the 
P3 protein recognizes the primary receptor of infection, 
the F-pilus, through its N-terminal domain, D2, while the 
neighboring region, D1, binds to the C-terminal domain of 
the periplasmic host protein, TolA, presumably after pili 
retraction (21, 45, 99). The D1 and D2 domains are separated 
by a disordered, glycine-rich linker (45, 46, 71). This pathway 
of membrane penetration has been suggested to be similar 
for other filamentous phages (45). In the case of the filamentous 
phage f1, the TolR and TolQ proteins are also 
involved in the infection process (12, 13). 

Phage particle assembly takes place at the periplasmic
membrane in a highly ordered fashion. The single-stranded
DNA coated with P5 protein is packaged “online” in the
periplasmic (or outer) membrane by the addition of
processed P8 protein residing in the membrane while P5
protein is removed. A normal assembly process results in
filamentous phage being released from the infected cells
without cell lysis; the cells instead maintain viability
and colony-forming ability.

* This gene-product nomenclature will be used for all
filamentous phages.

Table 44-1 Current Phage Display Options

                                 Type of fusion
Phage         Fusion protein     N-terminal     C-terminal

Filamentous   P3                 Yes            Yes
              P8                 Yes            -
              P6                 -              Yes
              P7/P9              Yes            -
              Jun-Fos            -              Yes
Lambda        V                  -              Yes
              D                  Yes            Yes
T4            Wac                -              Yes
              Soc                Yes            Yes
              Hoc                Yes            -
T7            gp10               -              Yes

The P3 protein can be divided into two functional parts: 
the N-terminus required for infectivity and the C-terminus 
(amino acids 198-406) for proper particle morphogenesis 
(18). During morphogenesis the P3 protein (embedded in 
the membrane) plays an important role by terminating the 
assembly and protecting the P6 protein from degradation 
(95). Furthermore, the C-terminal domain of protein P3 is 
critical for membrane release as well as for virion stability 
(93). Particles devoid of P3 show aberrant sizes and are 
obviously noninfectious. Until recently, P3-based protein 
display has been achieved by N-terminal fusions leaving 
the C-terminal part of gene 3 intact, and it was thought 
that this part could not be engineered for display. 

Fuh and Sidhu (37) have subsequently shown, however,
that the C-terminus of the P3 protein can be manipulated
to allow display in an M13 phagemid system. In this case,
C-terminal fusions were achieved via optimized linker
sequences. This type of display is suitable for functional
complementary DNA (cDNA) cloning since it avoids the
stop codon problem and also for the study of protein-protein
interactions requiring free C-termini. Prior to the discovery
of C-terminal domain display by protein P3 in filamentous
phage, Crameri and Suter (17) had designed a two-component
vector system suitable for cDNA display. They
took advantage of the leucine zipper set, Jun and Fos, as a
way to forge P3 and the expressed cDNA protein together
using disulfide bonds located in the zipper. This system has
been used to isolate allergens from a number of sources
via cDNA display (16).
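
As a rough illustration, the options in table 44-1 can be encoded as a lookup so that the platforms permitting C-terminal fusion (the orientation that sidesteps the stop-codon problem in cDNA display) can be listed directly. This is only a sketch: the dictionary layout and the function name c_terminal_platforms are hypothetical conveniences, and the Yes/No entries are simply transcribed from the table.

```python
# Table 44-1 transcribed as (phage, fusion protein) -> (N-terminal, C-terminal).
# The data structure and helper name are illustrative assumptions, not part
# of any published tool.
DISPLAY_OPTIONS = {
    ("Filamentous", "P3"): (True, True),
    ("Filamentous", "P8"): (True, False),
    ("Filamentous", "P6"): (False, True),
    ("Filamentous", "P7/P9"): (True, False),
    ("Filamentous", "Jun-Fos"): (False, True),
    ("Lambda", "V"): (False, True),
    ("Lambda", "D"): (True, True),
    ("T4", "Wac"): (False, True),
    ("T4", "Soc"): (True, True),
    ("T4", "Hoc"): (True, False),
    ("T7", "gp10"): (False, True),
}


def c_terminal_platforms(options=DISPLAY_OPTIONS):
    """Return the (phage, protein) pairs that the table lists as C-terminal fusions."""
    return sorted(key for key, (_n_term, c_term) in options.items() if c_term)


if __name__ == "__main__":
    for phage, protein in c_terminal_platforms():
        print(f"{phage}: {protein}")
```

Such a lookup only restates the table; whether a given platform suits a particular cDNA library still depends on the considerations discussed in the text.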

P3- and P8-based systems differ in their capacity
to display proteins. P3-based systems are able to display
both peptides and functional proteins whereas display by
P8, the major coat protein, is restricted to small peptides,
such as a 6-mer. There are a number of reasons for this
restriction: (i) display of larger peptides reduces phage
viability, for example 16 amino acids displayed via the P8
protein reduce phage viability to 1% (51, 72, 73); and (ii)
the first five amino acids of the P8 N-terminus can vary
but their deletion also results in reduced phage viability,
though insertion of a pentapeptide between positions 4 and
5 did not affect the helical symmetry of the phage and
the peptide was exposed in an extended form (62). On the
other hand, proline-rich sequences with large hydrophobic
residues at position 7 and Asn at position 1 in the
P8::peptide fusions were found to enhance particle viability
(50). The amino acid residues present in the peptides are
normally accessible except for those located 47 residues
or fewer from the C-terminus of P8 (119).

The filamentous phage display vectors, including those
of the phagemid types, are derived from phages fd, f1, M13
or modular hybrids of theirs. Vector options for P3 and P8
display can be described as type 3, 3+3, and 33 as well as 8,
8+8, and 88 (114). Phagemids are used in 3+3 or 8+8
display where wild-type P3 or P8 respectively are supplied
in trans by a helper phage. In type 33 or 88 display, wild-type
and the recombinant-gene version reside in the same
phage chromosome. A genetically stable fd type 88 peptide
display vector (Fth1), giving rise to high titers of recombinant
phages, has recently been described (34).
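
The type nomenclature can be summarized in a few lines of code. The sketch below is a hypothetical helper, not an established tool: the function name and the labels "none", "helper", and "same_genome" are invented for illustration, and the reading of type 3 (or 8) as a vector carrying only the recombinant gene is an assumption inferred from the naming scheme rather than stated in the text.

```python
def vector_type(coat_gene, wild_type_source):
    """Name a filamentous display vector type from where wild-type protein comes from.

    coat_gene: 3 or 8 (the coat-protein gene carrying the fusion).
    wild_type_source (hypothetical labels):
      "none"        - only the recombinant gene is present (assumed meaning
                      of type 3 or 8)
      "helper"      - phagemid; wild-type protein supplied in trans by a
                      helper phage (type 3+3 or 8+8)
      "same_genome" - wild-type and recombinant genes reside in the same
                      phage chromosome (type 33 or 88)
    """
    if coat_gene not in (3, 8):
        raise ValueError("coat_gene must be 3 or 8")
    gene = str(coat_gene)
    if wild_type_source == "none":
        return gene
    if wild_type_source == "helper":
        return f"{gene}+{gene}"
    if wild_type_source == "same_genome":
        return gene + gene
    raise ValueError(f"unknown wild_type_source: {wild_type_source!r}")


if __name__ == "__main__":
    print(vector_type(3, "helper"))       # 3+3
    print(vector_type(8, "same_genome"))  # 88
```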

P8::protein fusions allow less than 1 copy displayed per
particle, but Nakayama et al. (84) were able to improve this
type of display 10-fold via mutational “tricks.” Recently,
mutations in gene 8 have been reported to improve up to
100-fold the display of large proteins including oligomeric
proteins (110). Likewise, display of the Stoffel fragment
of Taq polymerase as a fusion with P3 protein was improved
more than 50-fold by selection of mutations of the signal
sequence originating from the pelB-encoded pectate lyase
of Erwinia carotovora that is often present in phagemid
constructs (56). But this is a case-by-case approach since
the mutations selected are not expected to support the same
increase in display for every protein.

In a commonly used phagemid-based system, dependent
on a helper phage to supply wild-type P3, only a small
fraction of the total phage particles show display after
assembly. To improve the frequency of display, gene
3-deleted helper phage have been used in combination
with plasmids providing wild-type P3 (27, 94). The recently
introduced M13 helper phage (“hyperphage”) appears to
be a further improvement of this approach. A 400-fold
increase in the display of single-chain antibodies has been
reported using the “hyperphage” (100). In this case there
is no competition by wild-type P3 during the assembly
of the virions and the full copy number of P3 display is
achieved. This approach, however, leads to loss of sometimes
useful monovalent display as there is no wild-type
P3 present to dilute the P3::protein fusion.
Display of fused proteins by the P6 protein is now
possible (55). This system has proven useful for
cDNA display as the fusion takes place at the C-terminus
of P6, which thereby alleviates the stop-codon problem
experienced by the N-terminal display (1, 36, 49, 111, 123).
More recently, the proteins P7 and P9 of phage M13
have been shown to work in functional co-display of
antibody heavy- and light-chain variable regions in a
phagemid format (38). In this case the chains were fused
to the N-termini of the P7 and P9 proteins, respectively,
and found to interact to form a functional Fv-binding
domain on the phage surface. The engineering of filamentous
phage M13 for phage display has recently been
reviewed (109).

Genomic DNA-derived filamentous phage display libraries
have proven very useful for identification of bacterial
proteins as well as domains that interact with a range of
target molecules (23, 54, 102). Indeed, a novel IgG-binding
protein has been identified in Staphylococcus aureus, thereby
demonstrating the power of this type of display (131).
Product toxicity or failure of filamentous phage release in
turn may hamper the display analysis. Therefore, special
phagemid systems designed to facilitate toxic-product
display have been constructed (7). However, the emerging
display systems based on tailed phages such as λ, T4,
and T7, which are released by cell lysis, offer a novel
window of display by avoiding translocation of the protein
fusions across the periplasmic membrane.

Phage λ Display

In phage λ, the major capsid protein, E, makes up an icosahedral
capsid of 415 monomers. A tail connector is positioned
at one of the capsid vertices. The tail is made up
of 192 copies of the V gene product. λ was thought to lack tail
fibers until Hendrix and Duda (43) showed that λ PaPa,
used in all λ studies, was a fiberless mutant (for a review of
phage λ, see chapter 27). It was then envisioned that λ and
its tail fibers could be used as a potential display system.
The first report describing λ display, however, used its tail
protein V for display (74). λfoo was designed for display at
the C-terminal end of V. Even though it suffered from poor
display efficiency (a few molecules per particle) and low
phage yields, display of a homo-multimeric protein such as
β-galactosidase was achieved. λfoo has also been used in
epitope mapping (65, 66, 79).

Dunn (28) developed a C-terminal λ V-display system
for presentation of a peptide (RRAVS) as target for
cAMP-dependent protein kinase and in a subsequent report it
was shown that the V-displaying peptide could complement
a V-defective mutant phage to essentially normal
phage yields (29). As for λfoo, only a limited display of
β-galactosidase was observed using Dunn’s V-display
system. In a subsequent study a limited display of the
α-complementation peptide of the β-galactosidase
system was achieved. Such purified α-peptide phages
also functioned in an in vitro α-complementation assay
of β-galactosidase (30). λ V display of an RGD sequence made
the phages able to transfect COS cells at a significant
frequency (31).

Several phages are known to strengthen their capsids
after assembly by the addition of special phage-encoded
decoration proteins. In the phage λ capsid, gpD of 11 kDa
is present in 405 copies. Deposition of 135 trimeric D units
on its capsid surface (26) is essential for stable head formation
but certain chromosomal deletions can compensate
for absence of D. A crystal-structure analysis of D in combination
with cryo-electron microscopy and image reconstruction
reveals that its N-terminus is disordered up to
Ser 15 whereas the C-terminus is well ordered. Both termini
are positioned on the same side of the trimer that
binds to the capsid (130). Despite this seemingly awkward
orientation, D works as a display platform for fusion proteins
connected to it by linker peptides at either terminus, as
demonstrated by Sternberg and Hoess (116) and Mikawa
et al. (76). Work in progress also shows that a 15 amino acid
peptide of the N-terminal sequence of D is able to bind
the expanded λ capsid (K.A. Miroshnikov et al., personal
communication, 77).

Sternberg and Hoess (116) first constructed display
plasmids in which D-fusion expression was under control
of the Trc or the more effectively regulated ara promoter.
By utilizing the cre-lox recombination system of phage P1,
the gpD-expressing plasmid with loxP can be picked up
by an infecting D⁻ λ chromosome containing a loxP site
in cells expressing Cre recombinase (see chapter 24 for
a review of phage P1). λ particles, with the inserted displaying
plasmid, were recovered as ampicillin-resistant
transductants/lysogens. The lysogens were then induced
to yield a library of λ particles that displayed peptides
or proteins as N-terminal D fusions (8, 116). This λ display
system has been further engineered for C-terminal D
display of cDNA (104). A certain instability, however, of
the λ plasmid cointegrates (even in the absence of Cre)
has prompted the construction of C-terminal D display
vectors, λ 171LoxP⁻ (105) and λ Dsplay1 (10, 133). Both
these vectors have been used for construction and panning
of cDNA display libraries and λ Dsplay1 was also compared
with T7 cDNA display (133).

λfoo has also been used for engineering display at
the N- or C-terminus of D (76). For N-terminal fusion,
plasmid vectors were designed for insertions in an
engineered site between the first and the second codon of D.
Plasmids for C-terminal fusion were also constructed
by allowing insertion immediately after the termination
codon of the D gene. The termination codon was then
used to regulate the number of fusion proteins per phage
particle by conditional chain terminations using E. coli
suppressors. λ mutants lacking the D protein could then
be complemented by these high copy D-fusion-expressing
plasmids, leading to particle display. Two phage vectors
were also constructed for display at both termini of
gpD: λfooDn (N-terminal) and λfooDc (C-terminal).
The latter vector has been used for C-terminal display in
a few studies (117, 118, 132) as well as a λfoo derivative
called λfooDcSfiI for C-terminal λ display of a cDNA
library (87).

Phage T4 Display 

T4 encodes two dispensable structural proteins, Soc and
Hoc, which bind to the outer surface of mature T4 capsid
(52). Soc is a 9 kDa protein and Hoc has a molecular weight
of 40 kDa. Structural analysis including recent cryo-electron
microscopy image reconstructions (53, 90) places
Soc as a trimer at the trigonal points of the P23* lattice
(the processed major capsid protein) whereas a single
Hoc molecule is found at the center of each P23* hexamer.
It is proposed that Soc functions like a clamp and protects
the capsid against harsh conditions such as high
pH and temperatures (see chapter 18 for a review of T4 biology).
Both Soc and Hoc have been developed as display
platforms. The first report to demonstrate surface display
in T4, however, used the fibritin proteins encoded
by the T4 gene wac (whisker’s antigen control). Even
though the C-terminus is essential for correct trimerization
and folding of the fibritin protein (67), it could be
extended by a 53 amino acid fusion to obtain functional
T4 display (33).

There are 960 and 160 copies per particle of Soc and
Hoc, respectively (57). Hence, Soc allows extensive multidisplay
by T4 that can be extended by the use of polyheads
(98). The rationale and operation of the display system developed
by Ren et al. (98) are similar to those of Sternberg
and Hoess’s system (116). Namely, Soc and its fusion derivatives
are expressed in E. coli, purified, and then bound
in vitro to polyheads or in vivo using positive-selection
vectors that force integration of soc-fusion sequences into a
soc-deleted T4 chromosome. In this case the progeny phage
carried, among others, a 312 amino acid sequence of poliovirus
capsid protein, VP1, fused to the C-terminus of Soc.
N-terminal peptide display using Hoc was achieved as part
of an elegant procedure to clone linear DNA fragments
in vivo. This work was later extended to a 183 amino acid
functional N-terminal fusion of Hoc (96, 97). Jiang et al.
(57) have also developed a T4 display system based on Soc
as well as Hoc. In this case the N-termini were used
as fusion points of a 36 amino acid PorA peptide from
Neisseria meningitidis. The T4 display system appears little
used so far.

Black and coworkers have described the development
of an internal T4 display system based on the
nonessential scaffolding protein IPIII that allows the
construction, packaging, and even specific processing
of proteins within the T4 capsid (47, 80, 81, 82). This
expression-packaging-processing system (EPP) for internal
display channels IPIII protein fusions into the T4 capsid
and offers many applications such as detection and purification
of proteins free from proteolysis as well as delivery
of proteins into E. coli cells, and perhaps others.


Phage T7 Display 

Phage T7 is released by cell lysis and the translocation of
any fusion protein through the cell membrane/wall is
avoided. T7 is a robust phage with a short latent period that
should speed up the selection process. As in the case of
phages λ and T4, the biology of T7 is very well understood
(see chapter 20) and offers a variety of host-vector systems
for a range of applications. The T7 capsid is composed
of 415 copies of the capsid protein 10, which are assembled
into 60 hexamers on the faces of the icosahedral structure
plus 11 pentamers at the head vertices. A short tail (genes 11
and 12) with its six tail fibers (gene 17) is attached at the
remaining vertex through the head-tail connector (gene 8).
The capsid protein is made in two forms: 10A (344 amino
acids) and 10B (397 amino acids). The production of 10B
results from a translational frameshift at amino acid 341
of 10A, and 10B makes up 10% of the capsid protein.
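
The copy number quoted above follows from simple capsid arithmetic, which can be checked with a short sketch. The constant and function names are chosen here for illustration only; the counts of hexamers and pentamers are those given in the text, and the figure of 12 vertices is the standard geometry of an icosahedron.

```python
# Counts taken from the text: 60 hexamers and 11 pentamers of capsid protein 10.
HEXAMERS = 60
PENTAMERS = 11
ICOSAHEDRON_VERTICES = 12  # geometric property of any icosahedron


def capsid_protein_copies(hexamers=HEXAMERS, pentamers=PENTAMERS):
    """Total protein copies: six per hexamer plus five per pentamer."""
    return hexamers * 6 + pentamers * 5


if __name__ == "__main__":
    print(capsid_protein_copies())           # 415
    # One vertex is not occupied by a pentamer; it carries the connector.
    print(ICOSAHEDRON_VERTICES - PENTAMERS)  # 1
```

The same total explains the "415" in the name of the high-copy display vector discussed next.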

The fact that functional T7 particles can be made
from either of the capsid protein variants, 10A or 10B, including
a range of mixtures of the two different protomers,
prompted the exploration of the T7 capsid protein as a
C-terminal display platform (101). The vector T7Select415-1b
is reported to accept peptides up to 40-50 amino acids
in length for display of 415 copies per particle. There are an
increasing number of reports in which the T7Select415
system has been used (11, 48, 115).

T7Select1-1b is designed for display of peptides and
proteins at less than 1 copy per phage particle and can tolerate
fusions up to 900-1200 amino acids at that display
density. In order to achieve the low-copy-number display
(0.1-1 fusion per phage), the Phi10 promoter was removed
and the original translation initiation site altered. Hence,
other upstream promoters take over but at a reduced efficiency.
Furthermore, a special 10A complementing host is
needed to achieve the low-copy display. Construction of
libraries in the T7Select systems utilizes vector arms and
T7 packaging extracts. The T7Select1-1b system or its derivatives
have been used for the display and panning of
cDNA libraries against different targets including small-molecule
chemical probes (42, 60, 85, 106, 107, 129). T7
display (Select1-2 series) has also been used to select RNA-binding
proteins from cDNA libraries (20). In this case, the
speed at which cycles of panning can be performed by the
T7 system was mentioned explicitly.
Miscellaneous Phage (and Virus) Display

The tailspike protein (TSP) of phage P22 (phage P22 is 
reviewed in chapter 29), six copies of which are attached 
to the capsid to form the tail, has been used in peptide 
presentation (9). A 13 amino acid antigenic peptide from 
the VP1 protein of foot-and-mouth disease virus was 
joined to the C-terminal end of TSP under control of a trp 
promoter. Both the endorhamnosidase and assembly 
activities of the TSP fusion were retained, indicating that
the P22-TSP system can be used for display.

Phage P4 (reviewed in chapter 26) encodes the capsid
decorating protein Psu, which has been used for peptide
presentation (68). Psu resides at each hexamer, probably as
a dimer, thereby stabilizing the mature P4 capsid similar
to D in phage λ and Soc in phage T4 (25). In Qβ (reviewed
in chapter 15), functional virion display has been reported
within the 195 amino acid extension of the coat protein
A1 (63, 122) and recently, guided by the MS2 atomic
structure, peptide display on live MS2 (also reviewed in
chapter 15) has been achieved (121). Addition of tri- or hexapeptides
at residue 100 of poliovirus capsid protein, VP1,
represents an implicit early example of virus display (14)
and phage display has now been extended to both animal
and plant virus systems (40).

Concluding Remarks 

Based on current experience, most if not all phage/ 
viral systems can probably be developed for display of 
peptides/proteins. The viability of the engineered particles 
sets the limits of display and in that respect the filamentous 
phages have proven very useful. Results on filamentous 
phage viability in organic solvent-water mixtures open up 
the possibility of employing phage display in nonaqueous 
media (89). 

Although phage peptide display has been very successful
in identifying ligands and epitopes (24, 61, 70) its use
can sometimes be capricious (83, 120). It remains a challenge
to assemble cDNA libraries for phage surface display
that fully represent the cellular messenger RNA status. The
recent engineering of C-terminal presentation in filamentous
phage and the tailed phage systems is widening
the window of cDNA display. Phage display is now an
established technology as witnessed by an increasing
number of applications, and more innovations are clearly
due. A perhaps unexpected such innovation was the isolation
of peptides with semiconductor binding specificities
(124). These peptides have been reported to distinguish
different crystallographic planes of gallium arsenide and
silicon. Filamentous phage display was recently used to
select peptides that favored top-phase partitioning of
phage particles in a PEG/sodium phosphate two-phase
system (3). In biosynthetic phage display, chemical synthesis
is combined with the genetic diversity (32).

In recent years cell-free display systems, such as
ribosome (41) and messenger RNA display (126), have
been developed for making molecular libraries in vitro.
These approaches are dependent on the cis-capture of the
displayed protein by its own template, but relieved from
cell-based transformation. Although procedures exist for
generating high-complexity phage display peptide libraries
(>10^9) (88), the in vitro approach should give rise to
even greater molecular/chemical diversities (35, 69), thereby
improving the chances of enhancing weak affinities of
molecular interactions (39). Phage display systems including
certain libraries are now commercially available, and in
a review on phage display and the development of tumor
targeting agents Nilsson et al. (86) provide some comments
on the intellectual property status of phage display.
Bacteriophage as Pollution Indicators

CHARLES P. GERBA


Because of the difficulty and cost of detecting waterborne
enteric pathogens, indicator organisms have
been used since the beginning of the twentieth century.
Much of the work in this area has focused on bacterial
indicators, such as coliform and fecal coliform bacteria.
However, it has been recognized in the last 30 years that
these traditional indicators do not always reflect the
waterborne occurrence of human pathogenic viruses and
protozoa. Thus, bacteriophages have been investigated as
a better indicator of these groups of pathogens in water,
as models of enteric virus removal by treatment processes,
and to examine the fate and transport of enteric viruses
in the environment (12).

The term “indicator organism” is often not clearly
defined. By contrast, an “index organism” is usually defined
as one related to the occurrence of a selected surrogate
microorganism or microorganisms (table 45-1). The relationship
may be direct, such as an index of human viruses, or
indirect, such as an index of fecal pollution or types of
fecal pollution (i.e., human or animal) (27). The criteria
for an index organism are very similar to those commonly
used for bacterial fecal indicators. An indicator organism,
on the other hand, is measured to check the performance
of a treatment process against previously set standards.
For example, an indicator is used to evaluate the performance
of drinking-water disinfection for the inactivation
of enteric viruses. To serve as an effective indicator, the
resistance of the indicator organism and the pathogen to
the disinfectant should be similar.

Three main groups of bacteriophages infecting enteric 
bacteria have received the greatest amount of study in 
the assessment of water quality. These are the somatic 
coliphages, the F-specific RNA coliphages, and the bacteriophages
infecting Bacteroides fragilis (table 45-2). This
chapter largely concerns the use of these bacteriophages 
as indicators of fecal pollution and as index organisms 
of pathogenic human enteric viruses. 


Somatic Coliphages 

The use of bacteriophages as a test of fecal pollution
was originally suggested by Coetzee (3) and Kott (31).
Kott showed that Escherichia coli B phages were more resistant
in the water environment than coliform bacteria
because the ratio of phages to coliforms shifted from 1:100
in the vicinity of sewage outfalls to 1:1 to 1:10 at more
distant locations. The group of bacteriophages infecting
E. coli B and related host strains are now described as
somatic coliphages. Natural host strains of somatic coliphages
include E. coli and closely related bacterial species.
Somatic coliphages are a heterogeneous group of phages
belonging to the Myoviridae, Siphoviridae, Podoviridae,
and Microviridae families (see chapter 2 for a review of
phage classification). The specific assay methods determine
which phages are detected and which are not. In this
respect, somatic coliphages are a method-defined parameter
similar to the total and fecal coliform groups of indicator
bacteria, with inherent disadvantages with regard
to method standardization and comparison of results
from different studies. In addition, somatic coliphages are
a heterogeneous group with regard to their response to
environmental factors (i.e., temperature, pH) and at least
some members of the group are able to multiply in waters
that are not subject to fecal contamination. However, the
contribution of this potential replication outside the gut
to their occurrence in natural environments has never
been quantified (27). Bacteriophages frequently used to
study somatic coliphage behavior are T-even and T-odd,
φX174, and PRD1. The biology of these various phages is
reviewed in chapters 18 (T-even), 17, 19, 20 (T-odd), 11
(φX174), and 13 (PRD1).

Coliphages infecting E. coli C are easily detected in
the feces of man, cattle, pigs, chickens, and other animals
(5, 16, 21). Phage numbers vary widely between <10
and 10^8 plaque forming units (PFU) per gram. The highest
numbers have usually been found in the feces of pigs, calves,
and broiler chickens. In human feces, somatic coliphages are
lower in number (often <10 PFU/g) or undetectable. The
arithmetic mean concentration of somatic coliphages varies
from 10^4 to 10^7 per gram, which is three orders of magnitude
lower than the fecal concentration of E. coli (10^7 to 10^9 per
gram) (11). Somatic coliphages have been detected in
2.5-88% of samples of human feces in studies from Europe,
the United States, Asia, and South Africa (14). Relatively low
isolation frequencies were found in studies in Japan, possibly
related to the choice of a relatively insensitive host
strain (14).

Table 45-1 Definitions for Indicator and Index Microorganisms of Public Health Concern

Group                       Definition

Process indicator           A group of organisms that demonstrates the
                            efficacy of a process
Fecal indicator             A group of organisms that indicates the presence
                            of fecal contamination. They only infer that
                            pathogens may be present
Index and model organisms   A group/or species indicative of pathogen presence
                            and behavior respectively, such as E. coli as an
                            index for Salmonella and F-specific RNA
                            bacteriophages as models of human enteric viruses

From Ashbolt et al. (1).


Table 45-2 Major Bacteriophage Groups Considered Appropriate Virus Models in the Environment

Description:  Somatic coliphages
Host strain:  E. coli C (most commonly used)
Comments:     Heterogeneous group of different morphology; frequent
              occurrence in human and animal feces (10^2-10^8 g^-1) and
              wastewater (10^3-10^4 ml^-1); may multiply in the
              environment; good persistence in the environment; readily
              inactivated by water treatment processes (with the
              exception of a few types)

Description:  F-specific RNA bacteriophages
Host strain:  S. typhimurium phage type 3 Nal^r (F' 42 lac::Tn5),
              E. coli HS(pFamp)R
Comments:     Homogeneous group, physical properties similar to those of
              enteroviruses; infrequent in human and animal feces (up to
              10^3 g^-1), frequent occurrence in wastewater
              (10^3-10^4 ml^-1); can multiply only at temperatures above
              30°C; relatively high resistance; serotypes may be related
              to the (human or animal) origin of fecal pollution

Description:  B. fragilis phages
Host strain:  B. fragilis HSP40
Comments:     Occur only in human feces (up to 10^8 g^-1); do not
              multiply in the environment; host strain possibly not
              applicable around the world; relatively low numbers in
              wastewater (<1-10^3 ml^-1); relatively homogeneous group;
              relatively high resistance

Somatic coliphages are the most abundant type found
in raw, untreated domestic sewage, with values ranging
from 10^4 to 10^5 PFU/ml (4, 8, 18). They are also found in
similar numbers in abattoir wastewater and animal slurries
(table 45-3).

F-specific RNA Bacteriophages 

F-specific RNA bacteriophages have simple polyhedral
symmetry (icosahedron), are 21-30 nm in diameter, and
contain single-stranded RNA as the genome. They belong
to the family Leviviridae. They infect bacteria through
the sex pili, which are coded by the F plasmid. The F plasmid
is transferable to a wide range of Gram-negative
bacteria. The pili encoded by the F plasmid do not form
below 25°C (37). Therefore, the probability of F-specific
phages replicating in the environment is low. The infection
process is inhibited by the presence of RNase in the
assay medium, which can be used to distinguish between
the F-specific RNA bacteriophages and the rod-shaped F-specific
DNA bacteriophages of the family Inoviridae, which also
infect the host cell through the pili (27). Chapter 15
reviews the biology of single-stranded RNA viruses, and
chapter 40 discusses the biology of Inoviridae phage of
mycoplasma.

Table 45-3 Typical Concentrations of Enteroviruses, Bacteriophages,
and Fecal Bacteria in Domestic, Hospital, and
Slaughterhouse Wastewater

Microorganism            Concentration (PFU or CFU/ml)

Enteroviruses            1 x 10^1
Somatic coliphages       1 x 10^4
F-specific RNA phages    3 x 10^3
B. fragilis phages       1 x 10^2
                         (<10^-2 in slaughterhouse wastewater)
Fecal coliforms          1 x 10^5
Fecal streptococci       1 x 10^4

PFU, plaque forming units; CFU, colony forming units.

Salmonella typhimurium WG49 and E. coli HS strains
have been modified to detect F-specific bacteriophages,
but will also detect a small number of somatic phages
(17) (table 45-4). All phages detected by the modified
strains are usually referred to as F-specific bacteriophages.
The number of F-specific RNA bacteriophages is
the difference between the number of phages counted
in the absence and in the presence of RNase in the assay
medium. More than 90% of the phages detected in sewage
by the modified strains are F-specific RNA bacteriophages
(27).
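
The RNase subtraction described above amounts to a one-line calculation, sketched below. The function names and the plaque counts in the example are hypothetical and chosen only for illustration; the sketch also assumes that parallel plates with and without RNase were inoculated with the same sample volume, and it clamps a negative difference (which could arise from counting noise) to zero.

```python
def f_specific_rna_count(plaques_without_rnase, plaques_with_rnase):
    """Plaques attributable to F-specific RNA phages.

    RNase blocks infection by the RNA phages, so plaques that still appear
    with RNase present are attributed to other phages (DNA phages, somatic
    phages). A negative difference is clamped to zero.
    """
    if plaques_without_rnase < 0 or plaques_with_rnase < 0:
        raise ValueError("plaque counts cannot be negative")
    return max(plaques_without_rnase - plaques_with_rnase, 0)


def rna_phage_fraction(plaques_without_rnase, plaques_with_rnase):
    """Fraction of all plaques on the host strain that are RNA phages."""
    if plaques_without_rnase == 0:
        return 0.0
    rna = f_specific_rna_count(plaques_without_rnase, plaques_with_rnase)
    return rna / plaques_without_rnase


if __name__ == "__main__":
    # Hypothetical counts from paired plates of one sewage sample.
    print(f_specific_rna_count(200, 15))  # 185
    print(rna_phage_fraction(200, 15))    # 0.925
```

With these made-up counts the RNA phage fraction exceeds 90%, in line with the proportion reported for sewage.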

F-specific RNA phages are less frequently found in
animal feces than somatic phages. Dhillon et al. (5), using
identification of plaques obtained from male E. coli, did
not detect F-specific phages in feces from cows, pigs and
humans. Osawa et al. (33) likewise did not detect F-specific
RNA phages in birds, pigs, and cows. They were detected
in 2% of the feces from humans and 2% of horse feces.
Havelaar et al. (16, 21) used the host strain WG49 and
found similar results: only the feces of broiler chickens
were relatively constant sources of F-specific phages. The
concentration of F-specific RNA phages is relatively low
(10^1 to 10^3/g) in free-roaming or wild animals (13). The
highest numbers are in animal feces from animal husbandry
operations (up to a mean of 10^6/g). Hence, feces from
wild animals do not appear to be an important source of
F-specific RNA phage.

In contrast, domestic sewage is an abundant source of 
F-specific RNA phages. The concentration of F-specific 
RNA phages is somewhat lower than that of somatic 
coliphages by a factor of 2 to 8 and ranges from 400 to
40,000 per milliliter (19; figure 45-1). Similar numbers
are found in wastewater from hospitals, pig slaughtering
operations, and poultry processing plants (18). The
higher numbers of F-specific RNA phages found in
wastewater than in feces suggest that these phages must
be able to multiply in sewage. Multiplication of bacteriophage
GA in pasteurized sewage and river and/or
seawater has been demonstrated at a temperature of 
20°C, but only if the host strain had been pre-grown 
in broth at 37°C (20). Hence, replication seems to be
restricted to environments with direct fecal contamination.
As phage and host bacteria are present at concentrations
of only about 10,000 per milliliter, it also is
not likely that multiplication will take place at environmental
sites other than the sewage system. Thus, particularly
because of their greater abundance within sewage,
the presence of F-specific RNA phages in water is more an
index of sewage pollution than just of fecal contamination
(23).


Bacteriophages Infecting
Bacteroides fragilis

Bacteroides fragilis is an anaerobic bacterium that has as 
its major ecological niche the human intestinal tract, and 
it has been argued that the same would be true for its 
phages (29). Phages belonging to the family Siphoviridae
(double-stranded DNA, long, flexible, noncontractile tails,
and capsids up to 60 nm) are the most
common infecting B. fragilis (27). Phages infecting this
bacterium attach to the cell wall. B. fragilis strains differ 
widely in the numbers of phages that they recover from 
domestic sewage. 

The isolation frequency of B. fragilis phages in human 
feces is relatively low: 0 to 15% of the samples (9). B. fragilis 
strain RYC 2056 has been reported to recover phage from 
28% of human stool samples (34). This host has also been 
shown to isolate phage in 30% of stool samples from 
pigs. Bacteriophages infecting B. fragilis strain HSP40 have 
been isolated from 10-13% of human stool samples, but not 
from animal feces (36). Tartera and Jofre (36) indicate that
in positive stool samples the concentration of B. fragilis
phages may range up to 10⁸ per gram. Kai et al. (30) reported
a range between 10² and 10⁵ per gram, and irregular patterns
of shedding when a single individual was studied
over time. Bacteriophages infecting B. fragilis strain
RYC 2056 are found in domestic sewage (Europe, Africa, and
America) in concentrations ranging from 10² to 10³ per
milliliter. This is one order of magnitude less than the
concentration of F-specific RNA phages. The ratio of
F-specific RNA phages to somatic coliphages is fairly constant
in domestic sewage.

Bacteriophages as Indexes 
of Sewage Pollution 

The preceding review suggests that at least three groups 
of bacteriophages infecting enteric bacteria are consistently 
found in sewage effluents. Thus, the three groups of 
phages may be considered as indexes of sewage pollution. 
On average, all three groups are more abundant in raw 
sewage than most enteric pathogens (27). No individual 
group fits the model of an ideal index organism (table 45-5) 
but they do aid in better defining the quality of water than 
traditional bacterial indicators. 


Bacteriophages as Indexes of Human 
Enteric Viruses 

The three groups of bacteriophages discussed have been 
proposed as potential indexes of the presence of human 
enteric viruses (i.e., enteroviruses, hepatitis A virus, 
Norwalk virus, etc.) on the basis of similar nucleic acid 
Table 45-4 Commonly Used Organisms for Assessment of Water Quality

Host | Strain | Strengths | Weaknesses | Stability
S. typhimurium (F-) | WG45 | Detects somatic Salmonella phages | Shows only somatic attack |
S. typhimurium (F+) | WG49 | Reported to be selective for F-specific RNA phages. Low rate of F' plasmid segregation. Kanamycin and nalidixic acid resistant | Not specific to F-specific RNA phages: also susceptible to attack by Salmonella somatic phages and F-specific DNA phages. Somatic Salmonella phages cause major interference | An unstable strain that unpredictably loses its ability to plaque F-specific phages
E. coli (F-) | CN, CN13 | Nalidixic acid resistant strain | |
E. coli (F-) | K-12 | | Show somatic attack |
E. coli (K-12 F+) | WG21, A/λ, Q13 | | Susceptible to F-DNA phage attack. Also, plaque somatic T phages. Highly inefficient for enumeration of naturally occurring F-specific RNA phages |
E. coli (F-) | B | | Produces plaque counts 5-6 times lower than other phages used for environment assay. Also, plaque somatic T phages |
E. coli (F-) | C | More plaques, highest counts. Nalidixic acid resistant. Most suitable for isolating DNA somatic phages, especially temperate phages | Plaque somatic T phages |
E. coli (F+) | C-3000 | | May be infected by some somatic coliphages. Majority of phages were somatic |
E. coli (K-12 F+) | W3110 | | |
E. coli | R AMP, RR | Ampicillin and streptomycin resistant. Gives the highest % of detection for F-specific RNA phages | Low counts and susceptible to F-specific DNA phage attack | E. coli RR, stable

Modified from Leclerc et al. (32).
Figure 45-1 Summary of phage densities in fresh waters
and sewage. Scatterplot (filled circles), regression line
(unbroken line), and 95% confidence intervals (dashed
line) of concentrations of enteric viruses in relation to
concentrations of F-specific RNA bacteriophages in river
and lake water in the Netherlands. For comparison,
similar data on raw and treated sewage are also shown
(open circles) (19).

composition and morphology. To be useful as index organisms
they should have an ecology similar to that of the
human enteric viruses. Havelaar (15) listed a number of
general criteria for the ideal enteric virus model in water: 
it should (i) occur exclusively and consistently in human 
feces, (ii) not occur in animal feces, (iii) not multiply in 
natural waters, (iv) outnumber human viruses in fecally 
polluted water by several orders of magnitude, (v) behave 
like human viruses in fecally polluted water treatment 
processes, and (vi) be detectable by simple, inexpensive, 
and rapid methods. 

All three groups of bacteriophages considered here 
occur in human feces. Somatic phages are frequently 
found in animal feces, but F-specific phages and phages 
of B. fragilis occur infrequently and in lower numbers in 
the feces of animals. Only the phages of the anaerobe 
B. fragilis seem unlikely to replicate in the environment. 
The literature indicates that the probability of replication
in the environment is greater for somatic coliphages
than for F-specific RNA bacteriophages.


Few studies have been conducted on establishing a
correlation between human enteric viruses and bacteriophages.
The most extensive data on bacteriophages in
relation to enteric viruses in water were presented by
Havelaar et al. (19). These authors collected data from
surveys conducted over a period of 7 years in different
types of treated and untreated wastewater, and in fresh
waters from the Netherlands. The enteric virus concentrations
varied widely between 0.001 and 570 per liter.
Bacterial model organisms (fecal coliforms and fecal 
streptococci) were significantly correlated with enteric 
viruses in river water and coagulated secondarily treated 
sewage effluent, but relatively low numbers were found 
in disinfected effluents and relatively high numbers in 
surface water open to nonhuman fecal pollution. F-specific 
RNA bacteriophages were also highly correlated with 
enteric viruses in all environments studied except for 
raw (untreated) and biologically treated (activated sludge) sewage.
Numerical relationships were consistent over a whole 
range of different environments; the regression equations of 
the F-specific RNA bacteriophages on enteric viruses in 
river water and lake water were statistically equivalent 
(figure 45-1). On average, a concentration of enteroviruses 
of 1 per 10 liters would correspond to a concentration of 
F-specific bacteriophage of 0.2 per milliliter. 
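
The two concentrations in the last sentence are quoted in different units, so the implied ratio is easy to misread. The short Python sketch below simply puts both figures on a per-liter basis. The function name is hypothetical, and the calculation restates only this single quoted point, not the full regression of Havelaar et al. (19).

```python
ML_PER_LITER = 1000.0


def phage_to_virus_ratio(phage_per_ml: float, virus_per_liter: float) -> float:
    """Return phages per enteric virus after expressing both per liter."""
    if virus_per_liter <= 0:
        raise ValueError("virus concentration must be positive")
    return (phage_per_ml * ML_PER_LITER) / virus_per_liter


# Values quoted in the text: 1 enterovirus per 10 liters, 0.2 phage per milliliter.
ratio = phage_to_virus_ratio(phage_per_ml=0.2, virus_per_liter=1 / 10)
print(f"about {ratio:.0f} F-specific phages per enterovirus")
```

On these figures the phages outnumber the enteroviruses by roughly three orders of magnitude, which bears on criterion (iv) of Havelaar's list.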

Jiang et al. (26) found that somatic coliphages could 
not be correlated with the presence of human enteric 
viruses in storm-water runoff impacting coastal waters of 
Southern California. The presence of human enteric viruses 
was significantly correlated with F-specific RNA bacteriophages,
however. Coliform bacteria, fecal coliforms, and
enterococci, on the other hand, did not correlate with the 
presence of the human adenoviruses. 

Recent research also suggests that F-specific RNA 
bacteriophages are useful indicators of enteric virus 
contamination of shellfish (oysters, clams, mussels). In a 
study in England it was found that F-specific RNA 
bacteriophage concentrations in oysters were strongly 
associated with harvest-area fecal pollution and with 
shellfish-associated disease outbreaks (6). Bacteriophage 
contamination exhibited a marked seasonal trend that 


Table 45-5 General Features of Bacteriophage Proposed in Water Quality Assessment

Feature | Somatic coliphages | F-specific RNA phages | Phages of B. fragilis
Homogeneity of the group | + | +++ | +++
Occurrence and concentration in human feces | ++ | + | +
Occurrence and concentration in animal feces | +++ | + | +
Occurrence and concentration in domestic sewage | +++ | +++ | +++
Probability of replication in the environment | ++ | + | -
Resistance to inactivation in the environment | ++ | + | +++

Modified from Jofre (27).

+++, high; ++, intermediate; +, low; -, very low.
was consistent with the trend of oyster-associated gastroenteritis
in the United Kingdom.

Stetler (35) monitored the occurrence of somatic coliphages
and enteroviruses in river water and at various
stages in a conventional drinking water treatment plant.
The coliphages could be detected in the source water by
direct inoculation, and sufficient coliphages were detected
in enterovirus concentrates to permit coliphage levels to
be followed through the different water treatment processes.
Statistical analysis of the data indicated that enterovirus
isolates were better correlated with coliphages than
with total coliforms, fecal coliforms, fecal streptococci, or 
standard plate count bacteria. 

Jofre et al. (28) found a correlation between the numbers 
of B. fragilis bacteriophage, enteroviruses, and rotaviruses 
in sewage-polluted marine sediments. The ratios between
the phages and either enteroviruses or rotaviruses in
the marine sediments were similar to the ratios found in
sewage, suggesting that they have a similar fate in marine
sediments. Gantzer et al. (10) studied the relative occurrence
and concentration of enteroviruses (by cell culture
infectivity and by polymerase chain reaction) and of coliphages
in treated secondary sewage effluents. They found
a significant correlation between the concentration of 
somatic coliphages or B. fragilis phages and the presence of 
infectious enteroviruses or the presence of enterovirus 
genomes. 


Bacteriophages as Indexes of Human 
and Animal Pollution 

The identification of the sources of fecal contamination is 
important in watershed management. For example, pollution
of streams may originate from multiple sources, such
as birds, septic tanks, or farm animals grazing in the
watershed. Knowing the sources allows for more effective
control measures. Serotyping of F-specific RNA bacteriophages
has allowed them to be divided into four groups.
Serotypes II and III have mainly been isolated from human
feces, whereas serotypes I and IV are usually found in animal
feces (7). It has also recently been shown that the
subgroups can be grouped into four main genotypes,
which, with few exceptions, show overall comparability
with the serotypes. Probes for each genotype allow plaque
hybridization and thereby study of the distribution of
subgroups isolated from water samples. Subgroups II and
III predominate in water samples contaminated with
human fecal pollution and subgroups I and IV predominate
in animal feces and in water contaminated
with animal feces (2, 22). Griffin et al. (13) used genotyping
to determine that animals were the source of fecal
contamination in a Florida spring.
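
As a rough illustration of how subgroup typing might be turned into a source call, the Python sketch below tallies the subgroups assigned to plaques from one water sample. The simple majority rule, the function name, and the example data are assumptions made for illustration only. The text states that subgroups II and III predominate in human pollution and I and IV in animal pollution, "with few exceptions", and it does not prescribe a decision rule.

```python
from collections import Counter

HUMAN_GROUPS = {"II", "III"}   # mainly isolated from human feces
ANIMAL_GROUPS = {"I", "IV"}    # usually found in animal feces


def likely_source(subgroups):
    """Return 'human', 'animal', or 'undetermined' by simple majority.

    `subgroups` is an iterable of subgroup labels ("I" to "IV"), one per
    typed plaque. The majority rule is an illustrative assumption.
    """
    counts = Counter(subgroups)
    unknown = set(counts) - HUMAN_GROUPS - ANIMAL_GROUPS
    if unknown:
        raise ValueError(f"unrecognized subgroup labels: {sorted(unknown)}")
    human = sum(counts[g] for g in HUMAN_GROUPS)
    animal = sum(counts[g] for g in ANIMAL_GROUPS)
    if human > animal:
        return "human"
    if animal > human:
        return "animal"
    return "undetermined"


# Hypothetical typing results for plaques picked from one sample.
print(likely_source(["I", "IV", "IV", "II", "I"]))
```

In practice a call of this kind would need enough typed plaques to be meaningful, and mixed results are to be expected where both sources contribute.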


Conclusions 

Bacteriophages of enteric bacteria have been proposed as 
indicators of the presence of fecal pollution, sewage, human 
enteric viruses, and human and/or animal fecal pollution. 
Each of the three groups of bacteriophages that has been 
studied has advantages and disadvantages (table 45-5). 
Somatic coliphages are the most abundant, and the methods
for their detection are the simplest. However, they are
a heterogeneous group and it has been reported that they
may replicate outside the gut, although this has received
only limited study. F-specific RNA bacteriophages are
second in abundance and are a homogeneous group. Their
method of detection is also simple and they are similar in
morphology and nucleic acid content to many of the human
enteric viruses. However, they do not survive as well as
some enteric viruses in natural waters (27). The bacteriophages
of B. fragilis are present least often and when present
occur in lower numbers than the other model bacteriophages.
They are a homogeneous group and there is
no evidence that they replicate outside the gut. The fact that
their method of detection requires anaerobiosis and that
they are present in low numbers are drawbacks to their
use as indicator organisms.

No one universal indicator of microbial water quality
has ever been found. It seems likely that multiple
indicators of water quality will come into use in the future. The
bacteriophages discussed here will likely play a role in
water quality assessment and tracking of contaminant
sources. The recent development of standardized methods
(24, 25) should lead to more research on potential application
of bacteriophages as model organisms.
The Use of Phage as Diagnostic Systems

CATH REES 


The most common use of bacteriophage in detection
methodology is phage typing. When developing panels
of phage which will discriminate between isolates of the
same species on the basis of lytic spectrum, phage are
chosen specifically because of their narrow (and therefore
discriminatory) host range. Since this is the use of
phage most commonly encountered, a general impression 
has been created that phage have a narrow host range. 
Hence, one of the most commonly voiced criticisms of 
phage-based detection methods is that the host range of 
phage (and therefore of any test) is limited, but this is
far from the truth. When phage are selected for detection
of species or of whole genera, those with the widest host
range are chosen. A classic example of this is the phage
Felix O1, first described as a broad-host-range phage infecting
Salmonella enterica by Cherry et al. (8). Subsequent
extensive studies of its host range have shown that it
will usually infect more than 95% of Salmonella isolates
(see 16 for a review). Similarly, in the Gram-positive genera,
listeriaphage A511 was reported to infect 95% of isolates of the two
major serotypes of Listeria associated with human disease
(serotypes 1/2a and 4b; 17). More recently, systematic
searches for broad-host-range lytic phage have demonstrated
that they can be readily isolated from natural
communities where the different bacterial genera are
likely to be found (14). Hence, phage-based detection tests
have been successfully developed, although few have met
with commercial success.

One reason for this lack of commercial success is that
the tests are rarely 100% effective: there will always be
mutants in a population that are phage-resistant. In the
case of reporter phage assays, however (see below), it has
been shown that host cells that will not propagate a
phage will still support infection and expression of the
reporter gene (i.e., the infection host range is wider than
the replication host range; see 16), and this fact increases
the versatility of reporter phage tests. It must also be
remembered that even using classical microbiological identification
techniques, atypical isolates are often reported
which would be recorded as false negatives, and in


DNA-based detection methods there is often a concern
about detecting both dead cells and cells containing cryptic
genes. Therefore, the criticisms leveled at the phage-based
assays do seem to be unfairly biased by this general
belief that “phage have a narrow host range” and this has
hampered their development. In practice, phage-based
assays have had to offer some additional benefit before 
being adopted in favor of other rapid molecular-biological 
detection tests. Those that have been successfully developed 
are described below, along with some newer concepts that 
are still at the developmental stage that may represent 
future applications of phage in diagnostic systems. 


Reporter Phage 

Of the several different types of phage-based detection
tests developed to date, the most widely described
has been recombinant reporter phage. The idea that
phage could be used to introduce reporter genes into a
bacterial cell had been well established by the development
of phage Mud-lac as a genetic tool. However, it was Ulitzur
and Kuhn (35) who first proposed the idea of using the
expression of the reporter genes following infection by
phage specifically as a rapid method of bacterial detection
(figure 46-1). The effectiveness of this methodology
was first demonstrated by simply using phage-based cloning
vectors engineered to contain the complete bacterial
bioluminescence (lux) operon, and by using such constructs
it was shown that as few as 10 Escherichia coli cells could
be detected (34). In addition this group carried out random
mutagenesis of wild-type phage genomes using Tn10
transposons carrying the bacterial luciferase genes (luxAB)
and recovered recombinant phage from bioluminescent
plaques. These simple constructs have been shown
to successfully detect the presence of enteric pathogens
in food samples (7, 15), but the most successful development
of reporter phage has been for Mycobacterium
tuberculosis.
Figure 46-1 Reporter phage assay. 1: Reporter genes
(e.g., bacterial luciferase, lux; firefly luciferase, luc; ice
nucleation protein, ina) are introduced into phage genomes
either by transposition or via homologous recombination
between wild-type phage and cloned phage sequences
carrying the reporter gene. Recombinant phage are purified
from the primary phage lysate by plaque purification and
phenotypic screening for the expression of the reporter
gene. 2: Purified reporter phage (e.g., lux phage) are
used to infect target cells (drawn as large, shaded cell) in a
mixed sample (other bacteria present are represented by
unshaded cells). Phage DNA is injected into the host cell and
replication begins. 3: Phage genes are expressed along with
the inserted reporter gene. The signal from the reporter gene
(e.g., light) is detected, indicating the presence of a cell
which is sensitive to phage infection.


The first reporter phage to be developed for the detection
of Mycobacteria was based on the lytic phage, TM4, in
this case using firefly luciferase (luc) as the reporter gene
(13). However, the rapidity of the lytic cycle meant that
only limited amounts of luciferase protein were produced
and the limit of detection using this construct was found
to be 10⁴ mycobacterial cells (13). A lytic phage had been
chosen because when predicting the “ideal” features of
reporter phage it was reasoned that a lytic phage would 
be the most effective, as all infecting phage would be 
actively replicating leading to high-level expression of the 
reporter genes. In fact, this did not prove to be a crucial 
factor in the successful development of reporter phage 
for Mycobacteria. 

When the same group constructed a reporter phage
based on the temperate phage, L5, they found that a constitutive
promoter was fortuitously generated when the
phage integrated into the chromosome, resulting in prolonged
expression and accumulation of the luciferase
protein and thus enhanced amplification of the signal
generated. This allowed the limit of detection to be reduced
to approximately 10² cells after a 40 hour incubation period,
or 10³ cells after 20 hours (29). However, the limited host
range of the L5 phage meant that it was not directly
applicable for the detection of M. tuberculosis in a commercial
test. Further work has been carried out to improve
the TM4-based phage, including changing the site of
insertion of the luc gene in the phage genome and isolating
various spontaneous mutants of the parent phage.
As a result, derivatives of TM4 can now detect as few
as 120 Mycobacterium bovis BCG cells after only a 12 hour incubation
period (6), providing a dramatic improvement in
detection times over standard culture methods for this
slow-growing organism. These phage have been used in
combination with further refinements of the assay conditions
and shown to successfully detect Mycobacteria in
smear-positive sputum samples within 24-48 hours (26).

In each of the cases described above, a random 
approach was taken to construction of the reporter phage, 
with the formation of plaques by progeny phage produced 
following the recombination/transposition event being 
used to identify viable constructs. This approach has also 
facilitated studies of the phage genomes, leading to the 
identification of nonessential regions (see 25, 29). However, 
a more structured approach has also been employed by 
first characterizing regions of phage genome and then 
integrating the reporter gene into a specific locus. This 
approach was used by Loessner et al. (20) who introduced
the luxAB genes into a late region of the virulent listeriaphage
A511 (see chapter 37 for review of Listeria phage
as well as a brief discussion of this reporter phage work).
A511 is a member of the Myoviridae family and has a
double-stranded DNA genome of approximately 116 kbp
(18). Studies of the capsid proteins had allowed the
late regions containing the genes for the major capsid
and tail sheath proteins to be identified (19). The luxAB
genes were introduced into a cloned fragment of phage
DNA to create operon fusions with the phage structural
genes without disrupting any of the original gene
structures. Recombinant phage were recovered from
lysates following infection of Listeria cells carrying the
plasmid constructs with wild-type phage. These progeny
phage were plated on a lawn of propagating bacteria
and plaques screened for a bioluminescent phenotype
when the lawn was exposed to aldehyde vapor. Double-recombinant
phage, which contained only the additional
luxAB genes and no vector sequences, were found at an
unexpectedly high frequency (5 x 10 , 20). 

This reporter phage was evaluated for its ability to 
successfully detect the presence of Listeria monocytogenes 
in different types of food samples (21). From this study it 
was clear that, as for many rapid methods applied to
food systems, the nature of the food matrix and the competing
microflora is critical to the sensitivity of the test.
Generally 1-10 cells per 25 g food sample were detectable
in relatively simple food matrices, but the threshold of
detection was nearer 200 cells per 25 g in more complex
samples. This study highlighted one of the main advantages
of a phage-based test: the specificity of the host-phage
interaction allows the detection of low numbers of
specific bacteria in a mixed population without the need 
for purification to homogeneity by successive rounds of 
enrichment and selective plating. 

The strategy used to construct these Listeria lux 
phages requires a minimum amount of genome analysis 
to allow successful insertion of the reporter genes while 
retaining all essential phage functions. The strategy is
generically applicable to all phage that can accommodate
the insertion of the marker gene without exceeding
the packaging constraints of the phage. In this case the
luxAB reporter genes are 2.1 kbp in size. Similarly, the
firefly luc gene is 2.3 kbp in size and the gene for the bacterial
ice nucleation protein, which has also been used to construct
reporter phage (39), is even larger at 3.4 kbp. The directed
approach to phage construction is only effective if the
packaging capacity of the phage is sufficient to allow
incorporation of the marker genes into the phage genome.
If the marker gene is too large to be accommodated within
the phage genome, then compensating deletion of nonessential
genes is necessary. However, such deletion requires
further analysis of the phage genome so that appropriate
redundant regions can be identified.
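
The packaging argument above is simple arithmetic and can be sketched as follows. The reporter sizes are those quoted in the text. The genome size and packaging limit in the example are hypothetical values chosen for illustration, and the function name is likewise an assumption. Real packaging limits depend on the individual phage and have to be determined experimentally.

```python
# Reporter sizes in kbp, as quoted in the text.
REPORTER_KBP = {"luxAB": 2.1, "luc": 2.3, "ina": 3.4}


def deletion_needed_kbp(genome_kbp: float, reporter: str,
                        max_packaged_kbp: float) -> float:
    """Return the kbp of nonessential DNA that must be deleted so that the
    recombinant genome stays within the packaging limit (0.0 if it fits)."""
    recombinant_kbp = genome_kbp + REPORTER_KBP[reporter]
    # Round to avoid floating-point noise in the reported figure.
    return round(max(0.0, recombinant_kbp - max_packaged_kbp), 3)


# Hypothetical phage: 48.0 kbp genome, assumed packaging limit of 50.5 kbp.
for name in REPORTER_KBP:
    print(name, deletion_needed_kbp(48.0, name, 50.5))
```

For this invented phage the two luciferase reporters fit without any deletion, whereas the larger ice nucleation gene would require a compensating deletion of about 0.9 kbp.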

The choice of reporter gene relates to the sensitivity
with which the signal generated is to be detected. For
instance, it is reported that only one molecule of ice nucleation
protein needs to be synthesized to allow a positive test
result (BIND assay: bacterial ice nucleation detection; 38).
Hence this was used in a commercial test developed for the
rapid detection of Salmonella as even low level expression
is readily detected. The sensitivity of this assay comes from
a two-step amplification of signal. Firstly, the reporter gene
(ice nucleation protein) is expressed at high levels in the
infected cell and these proteins become localized in the
outer membrane. Secondly, the presence of the protein is
detected using a phase-sensitive fluorescent dye which
changes color as the buffer freezes. Hence, when samples
are cooled, only buffer solutions containing transfected
salmonellae freeze, causing the dye to change color. This
assay allowed the detection of samples containing only
2 cells/ml Salmonella enteritidis in a buffer system within
3 hours. The sensitivity of the assay was further increased
by using it in combination with salmonellae-specific
immunomagnetic bead separation (12).

All the reporter phage described so far have been
viable phage with a full complement of genes to allow
phage replication. In light of public concerns about the
release of recombinant microorganisms, use of these types
of phage will necessarily be restricted to specialized
laboratories. More recently, Kuhn et al. (16) have
described the construction of “locked” phage that do not
produce viable phage particles unless grown on a
specially engineered host strain. First they isolated double
amber mutants of the Salmonella phage Felix O1. These
would not grow on Salmonella enterica LT2 and would only
propagate on the sup+ Salmonella strain K772, which can
suppress such amber mutations. Because the phage were double amber
mutants, reversion rates were very low (10⁻⁸ to 10⁻⁹ per
generation). Genetic analysis of the phage genome was
carried out to identify a region of approximately 3 kbp
which contained both essential and nonessential genes.
This region was replaced with a DNA fragment containing
the bacterial luciferase genes (luxAB) and the supF gene by
homologous double recombination. For the propagation of
such phage, a sup- host was used which contained the
missing segment of the Felix O1 genome on a plasmid
construct to provide the essential phage gene functions
in trans. In this case, recombinant phage were only rarely
recovered, which was thought to be due either to a limitation
in the recombination event (only short homologous flanking
regions were present in the plasmid construct) or possibly
to a feature of the phage biology if degradation of
host (including plasmid) DNA occurred following phage
infection.

The phage produced in this way are genetically “locked,”
as they need to maintain the luxAB-supF segment to
allow them to propagate in wild-type Salmonella, and, due to
the missing essential gene, will only successfully replicate in
a strain containing the complementing plasmid. Following
infection of other Salmonella strains, the luciferase genes
are successfully expressed, allowing detection of the
infected cell, but no viable phage particles are produced at
the end of the infection cycle. Although this method requires
many more manipulations and far more extensive genetic 
analysis of individual phage, the fact that the reporter 
phage produced are effectively nonviable when used in the 
assay should go some way to address the concerns of 
environmental release of genetically modified organisms. 


Phage Amplification Assay 

The major drawback of the reporter phage technology is
the fact that genetic engineering is required for each new
phage to be employed. The difficulty of generating such
phage means that new phage cannot quickly be developed
for the detection of either new genera or new species/
subspecies. The emergence of new virulent subspecies
(such as the appearance of Escherichia coli O157:H7) or
changes in the requirement to detect human pathogens
(such as the identification of Campylobacter as a significant
human pathogen) means that rapid detection methods
need to be easily adapted to detect these new groups.
A method that fulfils this requirement is phage amplification
technology. This test uses only wild-type phage
and the endpoint of detection is the formation of a plaque. 
No gene engineering is required. Therefore there are no 
containment concerns and the technology uses standard 
microbiological methods. Hence, staff carrying out routine 
testing do not need high levels of training in molecular 
microbiological methods and the methodology can easily 
be established as part of more traditional microbiological 
testing regimes. 

The assay was first described by Stewart et al. (31, 33)
who termed it phage amplification. A similar
methodology has been described by Wilson et al. (22, 36) but
termed the phage amplified biologically (PhaB) assay. The word
“amplification” is used to describe these assays because
the primary phage infection event which signals the presence
of the target bacterial cell is only detected when the
progeny phage are allowed to replicate and develop into
a plaque (figure 46-2). The test begins by adding phage specific
for the target bacterium to the test sample. Time is
allowed for the phage to bind to the host cell and to enter
the eclipse phase with the nucleic acid delivered into
the host cell cytoplasm. At this point a virucide is added
to destroy all those phage which have not successfully
infected a bacterial cell. After a short period of incubation
to allow complete killing of all the free phage particles,
the virucide is neutralized and the sample mixed
with a laboratory strain (termed the “helper” bacteria)
which is known to support the replication of the phage
used. This mixture is then spread over the surface of
an agar plate as a soft agar overlay. Once the replication
of the eclipsed phage is complete, cell lysis of the target


1 .Phage infection of target cell 



2.Destruction of remaining external phage 
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Figure 46-2 Phage amplification (or PhaB) assay. 1: Phage 
are used to infect target cells in sample and time is allowed 
for infection to proceed to the eclipse phase. 2: A virucide is 
added that will inactivate phage which have not infected but 
does not affect the viability of the host cell. Phage 
replication continues in the infected cells. 3: The virucide 
is neutralised and the infected cells are mixed with a large 
excess of helper cells, known to support the replication of 
the phage. The whole sample is mixed with soft agar to 
form a bacterial lawn. Replication of the infected cell is 
completed and cell lysis occurs. Released phage particles 
infect helper cells in the lawn and plaques are formed. 

Each plaque represents one infected cell in the original 
sample. 
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bacterium occurs and the progeny phage are released. 
These can then go on to infect the helper bacteria present 
in the lawn and a plaque develops at each site where 
originally there was an infected bacterial cell in the 
sample tested. Hence, each plaque represents the presence 
of one target bacterium in the original sample. 
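
The counting rule for the phage amplification assay (one plaque per infected target cell) can be turned into a simple concentration estimate. The sketch below is illustrative only: the function name, the plated volume, and the dilution factor are assumptions introduced here, not part of the published protocol, and it assumes that every target cell was infected and survived the virucide step.

```python
def target_cells_per_ml(plaque_count, volume_plated_ml, dilution_factor=1.0):
    """Estimate target cells per mL of the original sample.

    Assumes one plaque corresponds to one infected target cell.
    dilution_factor is the fold dilution applied before plating
    (e.g. 10 for a 1:10 dilution); all names here are hypothetical.
    """
    if plaque_count < 0:
        raise ValueError("plaque_count cannot be negative")
    if volume_plated_ml <= 0 or dilution_factor <= 0:
        raise ValueError("volume and dilution factor must be positive")
    return plaque_count * dilution_factor / volume_plated_ml


# Example: 42 plaques from 0.5 mL of a 1:10 dilution
estimate = target_cells_per_ml(42, 0.5, dilution_factor=10)
print(f"Estimated target cells per mL: {estimate:.0f}")
```

In practice the estimate is a lower bound, since any target cell that the phage fails to infect produces no plaque.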

This technology has been shown to be effective for 
the detection of a diverse range of bacterial pathogens 
such as Listeria, Campylobacter, Pseudomonas, Salmonella, 
and Escherichia coli, but to date it has been commercially 
developed only for the detection of Mycobacterium 
tuberculosis (FASTPlaqueTB™; see 24). When testing for 
the presence of slow-growing organisms, such as M. tuberculosis, 
the great advantage of these assays is that the 
target organism and the helper organism do not need to be 
the same, as long as the phage can infect both cell types. 
Thus, in the M. tuberculosis assay, the helper organism 
chosen is M. smegmatis, which can form lawns within 
12-48 hours rather than the 8 weeks required for M. tuber¬ 
culosis cells to grow into visible colonies. For fast-growing 
bacterial pathogens there is no obvious advantage to the 
technique, as colonies can develop using standard micro¬ 
biological detection methods within 12 hours. However, 
this does not take account of the time required to carry 
out the confirmatory tests which are often needed follow¬ 
ing the presumptive identification of a bacterial colony. 
When using the phage amplification procedure, much of 
this specificity can be built in by choosing phage with a 
suitable host range, thus achieving an overall reduction 
in detection time. 


Antibiotic Sensitivity Testing 

Both the reporter phage technology and the phage ampli¬ 
fication technique have been used to develop antibiotic 
sensitivity tests for the pathogens detected by these assays. 
The basis of this type of test is slightly different depending 
on which type of phage-based assay is used, but the end 
point assay is much the same. Simply: if the antibiotic is 
added and the target cell is sensitive to it, then the phage 
infection event is not detected as no signal is generated. 
In the case of the luc reporter phage test, no expression of 
the reporter gene is seen since the cells need to be alive 
and to carry out both protein and ATP synthesis necessary 
to produce bioluminescence from the firefly luciferase (6, 
13, 27). When using the phage amplification test, simply 
no plaques would be formed as primary replication of the 
phage in the target cell is inhibited (1, 10). Hence by com¬ 
paring the results of the assay with and without the 
addition of the antibiotic, the resistance (or sensitivity) of 
a bacterial pathogen can be rapidly determined without 
the need to wait for the growth of that organism following 
isolation. As before, the greatest advantage here is for 
slow-growing or fastidious organisms that are difficult to 
culture, and accordingly tests have been developed and 
evaluated for determining the antibiotic resistance profile 
of M. tuberculosis strains isolated from patients prior to 
antibiotic therapy (2, 3, 23). 
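
The comparison of assay results with and without antibiotic can be sketched as a plaque-reduction calculation for the phage amplification format. This is a minimal illustration, not a validated interpretive rule: the function names are hypothetical and the 99% reduction cutoff is an assumed placeholder that a real test would have to establish by calibration against a reference method.

```python
def plaque_reduction(plaques_without_drug, plaques_with_drug):
    """Fractional drop in plaque count caused by the antibiotic."""
    if plaques_without_drug <= 0:
        raise ValueError("the no-antibiotic control must yield plaques")
    if plaques_with_drug < 0:
        raise ValueError("plaque counts cannot be negative")
    return 1.0 - plaques_with_drug / plaques_without_drug


def classify(plaques_without_drug, plaques_with_drug, cutoff=0.99):
    """Label an isolate using an ASSUMED reduction cutoff (hypothetical)."""
    reduction = plaque_reduction(plaques_without_drug, plaques_with_drug)
    return "sensitive" if reduction >= cutoff else "resistant"


print(classify(200, 0))    # no plaques with antibiotic
print(classify(200, 180))  # plaques largely unaffected
```

The no-antibiotic control matters: without plaques on that plate, an absence of plaques on the antibiotic plate says nothing about sensitivity.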

Phage-Mediated Release of ATP 

One of the most commonly used rapid hygiene tests 
is the bioluminescent determination of released bacterial 
ATP (see 30). These assays are based on the fact that there 
is a linear relationship between the number of photons 
produced by firefly luciferase and the number of ATP mole¬ 
cules hydrolyzed and the fact that the amount of ATP per 
bacterial cell in a given growth condition is quite constant 
(approximately 10~ g per cell). Many commercial com¬ 
panies produce kits which provide the luciferin and lucif¬ 
erase substrates to allow simple measurement of ATP levels 
using a luminometer, and also included are the lysing 
reagents required to break open the bacterial cells and 
release the intracellular ATP for measurement. Because these 
reagents lyse all bacterial cells indiscriminately, the 
technique gives only a measure of total microbial load and 
no indication of the composition of the microflora. 

The ability of phage to lyse only specific cells within 
a mixed population has been used to add specificity to 
this test. This was first demonstrated for the detection of 
Listeria using both intact phage (28) and purified phage 
lysin (32). However, although both groups could add spec¬ 
ificity to the test, the problem exists that the practical limit 
of detection for these assays is approximately 10^4 bacterial 
cells; below this number the amount of light generated 
is too low to be detected against background. Obviously 
this does not give the level of sensitivity required when 
testing for pathogens, when tests must often demonstrate 
the absence of that organism from 25 g of sample. The sen¬ 
sitivity of the assay has been improved by amplification 
of the ATP signal through the measurement of released 
adenylate kinase (see 9) which amplifies the biolumines¬ 
cence signal due to its ability to continue generating ATP 
from ADP in a linear fashion. Using this technique, it has 
been shown that the limit of detection of phage-based 
assays can be reduced to less than 10^3 E. coli and Salmonella 
cells (5, 40), which compares favorably with other rapid 
methods of bacterial enumeration such as immunoassays. 

Future Horizons 

One of the most interesting new ideas put forward for the 
use of bacteriophage in detection assays is the dual phage 
technology (37; figure 46-3). In this case it is not the speci¬ 
ficity of the host-phage relationship that is exploited. 
Rather, this technology uses phage to report on the successful 
binding of an antibody to a specific antigen. The 
phage used are transducing phage that can confer antibiotic 
resistance on a host cell following infection. The antibodies 
are linked to the surface of two different phage particles, 
each one conferring resistance to a different antibiotic. 
Antibody-linked phage are mixed with the sample and if 
the antigen is present then the two different phage types 
are physically linked closely together. Host cells for the 
phage are then added to the mixture and the cells plated 
out on media containing both antibiotics. A low multiplicity 
of infection of phage is used, so normally the probability 
of coinfection by the two types of phage is very low. 
If large numbers of double-resistant colonies appear on the 
plates, then this indicates that the different types of transducing 
phage had been located close together while linked 
to the same antigenic particle via the antibody moieties and 
this dual infection signals a positive detection event. 

[Figure 46-3 panel labels: 2. Phage mixed with test sample; 3. Infection of host strain & growth] 

Figure 46-3 Dual phage detection assay. 1: Two transducing 
phage are used carrying resistance genes (R1 and R2) to two 
different antibiotics (A1 and A2). Different antibodies specific 
for the same antigen are bound to the phage capsids. 2: 
Phage-antibody complexes are mixed with the test sample 
and antibody binding allowed to occur. 3: Phage are mixed 
with a propagating strain at a low multiplicity of infection 
and plated on agar containing both antibiotics (A1 and A2). 
Only cells which have received both resistance genes grow 
into colonies, indicating that the phage were physically 
linked together through antigen binding. 
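
Why a low multiplicity of infection keeps the background of double-resistant colonies small can be illustrated with a standard Poisson model of infection. This is a sketch under stated assumptions, not the authors' analysis: it assumes that, in the absence of antigen, the two phage types adsorb to host cells independently and that the number of phage of each type per cell is Poisson-distributed with mean equal to its multiplicity of infection. The function names and example values are hypothetical.

```python
import math


def p_infected(moi):
    """Probability a cell receives at least one phage (Poisson model)."""
    if moi < 0:
        raise ValueError("multiplicity of infection cannot be negative")
    return 1.0 - math.exp(-moi)


def p_chance_coinfection(moi_a, moi_b):
    """Probability a cell receives both phage types by chance alone,
    assuming the two types infect independently (no antigen bridging)."""
    return p_infected(moi_a) * p_infected(moi_b)


def expected_background_colonies(host_cells, moi_a, moi_b):
    """Expected double-resistant colonies on an antigen-free control."""
    return host_cells * p_chance_coinfection(moi_a, moi_b)


# Illustrative numbers only: 1e5 host cells, each phage at an MOI of 0.001
print(expected_background_colonies(1e5, 0.001, 0.001))
```

At small multiplicities the chance coinfection probability is close to the product of the two multiplicities, so lowering both tenfold lowers the expected background about a hundredfold; colony counts well above this expectation are what the text treats as a positive signal.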

In this case there is greater than a million-fold amplifi¬ 
cation of the signal, as a single infection event grows into 
a visible antibiotic-resistant colony containing more than 
10^8 cells, and thus the sensitivity of the test is great. The ability 
to use two different antibodies in one detection assay 
also ensures a high degree of specificity. This dual phage 
assay has the advantage that no phage engineering is 
required and it can be applied to any type of antigen for 
which an antibody exists, not just for the detection of 
bacterial cells. Trials have been reported showing that the 
assay could sensitively detect human immunodeficiency 
virus particles in blood samples (S. Wilson, Microsens Biophage, 
UK, unpublished data) and it is also intended to develop 
sensitive dual phage assays for prions. 
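
The amplification figure quoted for the dual phage assay follows from simple arithmetic on colony growth. The sketch below assumes idealized binary fission with no cell death, which is a simplification introduced here; the function name is hypothetical.

```python
import math


def doublings_to_reach(colony_cells):
    """Number of doublings for one cell to reach the given colony size,
    assuming idealized binary fission with no cell death."""
    if colony_cells < 1:
        raise ValueError("a colony contains at least one cell")
    return math.log2(colony_cells)


# One doubly infected cell growing into a colony of 1e8 cells
print(round(doublings_to_reach(1e8), 1))
```

A colony of 10^8 cells arising from one cell is a 10^8-fold increase, which is consistent with the statement that the amplification exceeds a million-fold.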

This type of ingenious application of our understand¬ 
ing of phage biology suggests that many more applica¬ 
tions will appear in the future, and the more we understand 
the biology and genetics of phage, the more applications 
will emerge. For instance, the fact that phage have evolved 
to specifically recognize and bind to structures on the 
bacterial cell surface means that as our knowledge of 
structural biology increases we can expect to see further 
applications of this feature. A simple example of how this 
might be exploited is a report of the use of Salmonella 
phage Felix O1 (also referred to as phage “Sapphire”) as a 
biosorbent to selectively separate cells from suspensions 
containing other related bacterial cell types (4). We can 
speculate that once the equivalent of the antibody minimal- 
recognition unit can be defined (see 11), phage receptors 
could be used to generate recombinant proteins, or protein 
complexes, with the desired binding properties — possibly 
even contributing to the development of novel families 
of affinity-binding proteins. 
Bacteriophages contribute to the virulence of many 
bacterial pathogens, largely by encoding the struc¬ 
tural genes for virulence factors. The most widely recog¬ 
nized phage-encoded virulence factors are bacterial 
exotoxins, which account for the characteristic clinical 
manifestations of a number of human diseases caused 
by bacterial infections. The previous edition of this chapter 
focused primarily on phage-encoded toxins, but in the 
intervening years two themes have emerged (169). First, 
phages are increasingly recognized for encoding genes 
that contribute to other aspects of bacterial pathogenesis 
in addition to toxin production. In fact, phage-encoded 
gene products contribute to virtually every facet of bac¬ 
terial pathogenicity, from attachment and invasion to 
immune evasion and transmission among humans. Second, 
while phages are like other mobile genetic elements in 
that they disseminate virulence genes among bacterial 
populations, phages have unique properties that enable 
them to contribute to bacterial pathogenesis by mecha¬ 
nisms other than transduction as well. For example, virion 
particles may themselves contain pathogenic compo¬ 
nents (12,13); and prophage induction, through gene ampli¬ 
fication, transcriptional upregulation, and phage-mediated 
lysis, can contribute to production and release of viru¬ 
lence factors from bacterial cells (167, 168). Owing to these 
developments, the contribution of bacteriophages to bacte¬ 
rial pathogenesis can no longer be conceived of simply 
as transduction of toxin genes that are regulated by the 
host bacterium. This chapter presents a summary of bac¬ 
teriophage involvement in bacterial pathogenesis, in which 
we consider (i) the nature of the bacterial virulence 
properties altered by phages, (ii) the basic mechanisms by 
which phages alter these properties, and (iii) the regulation 
of the phage-related virulence property by the phage and/or 
its host bacterium. 


Bacteriophages in the Pathogenesis of Gram-Positive Organisms 

Clostridium botulinum 

Clostridium botulinum is a Gram-positive, anaerobic, spore¬ 
forming bacillus responsible for botulism. This illness, 
characterized by flaccid paralysis, is acquired by ingestion 
of the preformed botulinum toxins, of which there are 
eight serologically distinct types (A, B, C1, C2, D, E, F, and G). 
Each type consists of a neurotoxin (BoNT) and a hemag¬ 
glutinin (HA), as well as other nontoxic, nonhemaggluti¬ 
nin (NTNH) components (84). BoNTs are zinc proteases 
that act inside presynaptic cholinergic neurons at the 
neuromuscular junction to cleave proteins necessary for 
fusion of acetylcholine-containing vesicles with the cell 
membrane, including synaptobrevin, SNAP-25, and syntaxin 
(117). Each BoNT type specifically cleaves one or 
more of these vesicle fusion proteins, impairing acetylcho¬ 
line release and thereby causing paralysis (117). 

The genes for the BoNT, HA, and NTNH toxin compo¬ 
nents are located within clusters that, in types C and D, are 
located on bacteriophages. The association of botulinum 
toxin with phages was first described in the 1970s, when 
nontoxigenic C. botulinum cells were rendered toxigenic 
following exposure to mitomycin-C-induced, cell-free lysates 
of toxigenic cultures (49, 79, 80). A toxin-converting 
phage, CEβ, was isolated from such lysates (49), and the 
toxin structural gene was identified in the CEβ genome 
(57). Among the botulinum toxin types, only C and D 
have been shown to be phage-encoded; the remainder 
are either plasmid-encoded (type G) or assumed to be chro¬ 
mosomal (types A, B, E, and F) (84). 

Regulation of BoNT production is poorly understood. 
A DNA-binding protein, BotR, encoded by a gene (botR) 
located in the toxin gene complex, was shown to increase 
toxin production when overproduced in C. botulinum (106). 
This protein bound specifically to the promoter regions 
involved in transcription of the NTNH, BoNT, and HA 
genes. Since botR is adjacent to the toxin genes, it is likely 
a phage-encoded factor in types C and D, but whether 
it plays a role in the transcriptional regulation of phage 
genes other than the toxin genes is unknown. Though early 
studies identified phage-related differences in the quantity 
of toxin production (66), the possibility of a relationship 
between phage transcriptional regulation and botulinum 
toxin production remains to be explored. 

Corynebacterium diphtheriae 

Corynebacterium diphtheriae is a Gram-positive bacillus that 
infects the upper respiratory tract and causes inflammation, 
necrosis, and the formation of an adherent pseudomem¬ 
brane. The organism elaborates the potent diphtheria 
toxin (DT), which is absorbed into the circulation and can 
cause a systemic, potentially lethal syndrome characterized 
by muscle weakness or paralysis and circulatory collapse 
due to myocardial dysfunction. DT is synthesized as a 
single, inactive polypeptide chain that is cleaved into two 
functionally distinct fragments: a C-terminal fragment, 
responsible for entry of toxin into the eukaryotic cell, 
and an N-terminal fragment, which catalyzes the ADP-ribosylation 
of eukaryotic elongation factor 2 (EF2) (40). 

Phage conversion of DT production by C. diphtheriae 
was discovered in 1951 by Freeman, who demonstrated 
that stable lysogeny of C. diphtheriae with a phage, designated 
β, was associated with virulence in a guinea pig 
intoxication model (55). It was subsequently proposed 
that toxin conversion by β could be due to transduction (7), 
a hypothesis that was proven by studying recombination 
between phage β and a related nontoxigenic phage, γ, 
with a host range different from that of β. Since recombinant 
phages possessing the host range properties of γ and the 
toxin-determining property of β could be isolated, toxigenicity 
was proven to be a discrete genetic attribute of β (61). 
In 1969, the tox locus of the phage was mapped (71), and 
definitive proof that the structural genes for the toxin 
were located on the phage genome was finally obtained 
in 1971, when nitrosoguanidine-induced phage mutants 
that produced serologically reactive but truncated, nontoxic 
DT were described (155). The tox gene is located near one 
end of the phage genome, adjacent to the attachment site 
(71, 96, 109, 144, 145), leading to the hypothesis that 
the gene was acquired by the phage during an aberrant 
excision event (96,129). 

Regulation of toxin production by C. diphtheriae was 
studied as early as the 1930s, when the negative correlation 
between environmental iron concentration and DT produc¬ 
tion by C. diphtheriae was described (101,130). The molecular 
mechanism of the regulation of DT production by iron 
concentration has been elucidated more recently. The iron-dependent 
repressor DtxR, encoded on the C. diphtheriae 
chromosome and postulated to be a global low-iron-response 
regulator, was shown to repress expression from 
the diphtheria toxin promoter (25, 140) by directly 
binding to an adjacent operator (141). Barksdale et al. 
discovered that UV light, a potent phage-inducing agent, 
greatly enhanced diphtheria toxin production, but only 
at low iron levels (6). This result suggested that prophage 
induction and replication could amplify DT production, but 
that tox transcription relies on the iron concentration 
being low enough that the DtxR repressor is inactive. These 
authors’ conclusion that DT production depends upon acti¬ 
vation of the latent DT-encoding prophage seems unlikely 
since DT production occurs during all phases of the 
phage life cycle (59,107,108). While these latter observations 
argue against the necessity of prophage induction for DT 
production, they do not negate the possibility that pro¬ 
phage induction may contribute to toxin production. It is 
possible that regulation of DT production integrates envi¬ 
ronmental stimuli through at least two pathways: prophage 
induction and the DtxR repression system. The importance 
of these factors to DT production by C. diphtheriae in 
humans remains relatively unexplored. 

Staphylococcus aureus 

Staphylococci are Gram-positive cocci that cause a wide 
range of diseases, including endocarditis, pneumonia, 
suppurative infections such as skin abscesses and septic 
arthritis, and diseases attributable to exotoxin production. 
Suppurative disease is characterized by formation of abs¬ 
cesses, composed largely of staphylococci and neutrophils, 
which are attracted by staphylococcal components. The 
nonsuppurative staphylococcal diseases include food 
poisoning, caused by the ingestion of preformed enterotoxins, 
as well as toxic shock syndrome (TSS), an acute, potentially 
lethal illness marked by fever, rash, desquamation, and 
hypotension that is caused by the toxic shock syndrome 
toxin (TSST). Many virulence factors produced by S. aureus 
have been isolated and characterized, although it has gener¬ 
ally been difficult to establish the role of individual factors 
(47). Nearly all staphylococci produce degradative enzymes 
that damage human tissues, and some strains produce a 
nonenzymatic fibrinolysin that acts as an anticoagulant 
via activation of plasminogen. Some strains produce toxic 
exoproteins, including hemolysins, the staphylococcal 
enterotoxins, the exfoliative toxins, the Panton-Valentine 
leukocidin (PVL), and TSST. A number of these virulence 
factors, including the fibrinolysin and most of the exotoxins, 
are associated with staphylococcal phages. 

The staphylococcal exfoliative toxins are responsible 
for the staphylococcal scalded skin syndrome, which 
primarily occurs in children. There are two distinct forms 
of exfoliative toxin, ETA and ETB, which are not directly 
cytotoxic but rather cause lysis of intercellular attachments 
in the epidermis, with subsequent blister formation. 
The gene encoding ETA was thought to be chromosomal 
until recently, when an ETA-encoding phage, φETA, was 
isolated from S. aureus, and a number of clinical S. aureus 
isolates were found to have phage gene sequences near the 
toxin gene (eta) (179). The complete φETA genome was 
sequenced, and the eta gene was found near the att site 
of the phage (179). The gene for ETB resides on a family 
of plasmids (23). 

PVL, produced by most strains of S. aureus, specifi¬ 
cally attacks neutrophils and macrophages (23). The toxin 
is composed of two protein components that are produced 
and secreted separately but act synergistically to cause 
lysis of leukocytes, possibly by altering their permeability 
to cations. Phage conversion of PVL production was demon¬ 
strated as early as 1972 (158), and analysis of sequences 
adjacent to the PVL genes ultimately led to the isolation 
of a PVL-encoding phage, named φPVL (86, 87). At least 
one other leukocidin-encoding phage has been isolated 
from a clinical S. aureus isolate (120). 

Although best known for causing vomiting during 
food poisoning, the staphylococcal enterotoxins have also 
been shown to have superantigen properties (see below), 
and have been implicated in TSS. At least one staphylo¬ 
coccal enterotoxin gene — sea or entA (enterotoxin A) — is 
phage-encoded (17, 32, 39). The enterotoxin E (entE) gene 
was cloned and sequenced: although no entE-encoding 
phage could be detected, the gene was found to be amplified 
in the presence of UV light and mitomycin C, suggesting 
that it too could be encoded by an inducible prophage 
(41). The enterotoxin B (entB) gene was identified on a 
discrete 26.8 kb chromosomal element that has not been 
further characterized (83). 

TSST, consisting of a single 22 kDa polypeptide chain, 
is one of a number of related toxins capable of producing 
TSS. The most serious consequences of TSS result from 
systemic shock, which TSST may produce by one or 
more of three proposed mechanisms: superantigenicity, 
enhancement of endotoxin activity, or direct activity on 
endothelial cells (47). The staphylococcal TSST gene, tstH, 
is present on the bacterial chromosome within a 15.2 kb 
mobile genetic element designated staphylococcal pathogenicity 
island 1 (SaPI1) (100). SaPI1 is mobilized at high 
frequency by the generalized staphylococcal transducing 
phage 80α and depends upon 80α for excision, replication, 
and encapsidation into 80α-like phage particles, 
a relationship reminiscent of the interaction between coliphages 
P4 and P2 (100) (see chapter 26). 

Some staphylococci produce a recently discovered 
chemotaxis inhibitory protein (CHIPS) that binds to and 
attenuates the activity of the neutrophil receptors for com¬ 
plement and formylated peptides (160, 161). This function 
is proposed to protect S. aureus from neutrophil-mediated 
killing, an important host defense against staphylococci. 
The gene for CHIPS (chp) has been shown to reside on a functional 
quadruple-converting phage that, in addition to 
transducing the chp gene, also transduces the staphylokinase 
(sak) and enterotoxin A (sea) genes, and eliminates 
β-hemolysin production (160). The latter effect presumably 
occurs via insertional inactivation of the hlb gene as is also 
achieved by lysogenization with another phage, φ13 (37, 38). 
Staphylokinase, or fibrinolysin, is a potent nonenzymatic 
catalyst of the conversion of plasminogen to plasmin, a 
clot-dissolving protease. The genes for staphylokinase were 
cloned from a phage genome (139), and production of staphy¬ 
lokinase was previously shown to be associated with lyso¬ 
genic conversion by phages other than the CHIPS-encoding 
phage (32,92,174). 

Regulation of phage-encoded virulence factors in S. 
aureus is poorly characterized, although a number of 
phage-associated staphylococcal virulence genes, includ¬ 
ing tst, entA, and eta, are regulated by chromosomal loci 
that coordinate expression of catabolic and secreted 
proteins at particular stages of growth in laboratory 
culture. These loci include agr, active during the shift to 
post-exponential growth phase, as well as sar and sae, 
each of which affects the expression of multiple exopro¬ 
tein genes (125). Additionally, several environmental 
signals affect the expression of virulence genes indepen¬ 
dently of these loci (34, 125). The role of phages in regu¬ 
lating staphylococcal virulence factor production has not 
been extensively studied, although enterotoxin A produc¬ 
tion varied with different encoding phages, suggesting 
that phage factors or processes distinct from the sea 
gene and its associated promoter could influence toxin 
production (24). 

A major obstacle in the treatment of staphylococcal 
disease is the tendency of staphylococci to develop resis¬ 
tance to antibiotics, generally via alteration of the target 
molecule of the antibiotic or via acquisition of genes for 
antibiotic efflux or inactivation (133). Acquisition of anti¬ 
biotic resistance genes by staphylococci occurs primarily 
via transduction and conjugation (53). Although no exam¬ 
ples of phage-encoded antibiotic resistance genes are 
known, phages may play an important role in the mobiliza¬ 
tion of resistance plasmids via generalized transduction 
and via a poorly understood process termed phage-mediated 
conjugation, which is mechanistically distinct from trans¬ 
duction (95,103). 

Streptococci 

Group A Streptococci 

Streptococci are Gram-positive cocci responsible for a 
number of human diseases, including streptococcal pharyn¬ 
gitis, scarlet fever, impetigo, and bacterial endocarditis. 
In addition, group A streptococci (GAS) infections can 
lead, via poorly understood mechanisms, to post-infectious 
syndromes including acute rheumatic fever, rheumatic 
heart disease, and acute glomerulonephritis. Like that of 
staphylococci, the pathogenicity of streptococci is multi¬ 
factorial, involving both bacterial structural components 
and exoproteins elaborated by the bacteria. Since lysogeny 
is very common among GAS (113) and transduction is 
the only known naturally occurring mechanism of horizontal 
gene exchange among them, it has been proposed 
that phages play an important role in periodic genetic 
shifts in GAS that may explain variation in the clinical 
features of infection (30, 90, 113). Furthermore, a number 
of specific virulence properties of GAS are influenced 
by bacteriophages. 

The streptococcal pyrogenic exotoxins (SPEs) are 
related proteins (SpeA, SpeB, SpeC) released by GAS that 
cause fever and toxic shock syndrome, and enhance the 
activity of endotoxin (111). These toxins also lead to the 
characteristic scarlet fever rash, mediated by delayed-type 
hypersensitivity in the skin and nonspecific T cell stimula¬ 
tion (superantigenicity) (111). As early as 1927, Frobisher 
and Brown described experiments in which nontoxigenic 
streptococcal strains acquired the ability to produce SPE 
following exposure to filtered supernatants of cultures of 
toxigenic streptococci (56). Much later, this conversion 
was demonstrated to be mediated by an SpeA-encoding 
bacteriophage (181). The structural genes for SpeA (85,172) 
and SpeC (60) were subsequently found to be located on 
the genomes of temperate phages, while SpeB is believed 
to be encoded by the streptococcal chromosome (113). There 
is evidence that toxin-encoding phages may play a direct 
role in regulating SpeA and SpeC production. Zabriskie 
(181) demonstrated enhanced phage and streptococcal 
pyrogenic exotoxin A (scarlatinal toxin) production by GAS 
in response to UV light, suggesting that prophage 
induction could contribute to SpeA production. Recently, 
Broudy et al. discovered a soluble phage-inducing factor 
(SPIF), elaborated by human pharyngeal epithelial cells, 
that induced an SpeC-encoding phage (60) and resulted 
in increased toxin production by GAS (26). Although the 
identity of the SPIF molecule and the mechanism by 
which it induces phage and toxin production remain to be 
explored, these findings imply a role for prophage induction 
in toxin production by GAS. 

Many strains of streptococci also produce hyaluronidases, 
enzymes capable of hydrolyzing hyaluronic acid, a 
component both of the bacterial capsule and of human 
connective tissue. In some cases, hyaluronidase genes are 
located on phage genomes (77), and hyaluronidase is 
incorporated into the virion particles (10, 11), possibly 
aiding the phage in capsule penetration during infection 
of or release from the streptococcal cell. Whether the 
phage-associated hyaluronidase activity, including that of 
the virion particles themselves, aids the bacterium in its 
spread through or destruction of human connective tissues 
is unknown, although antibody to phage-encoded 
hyaluronidase is detectable in the serum of patients with 
GAS infections (63). 

The M protein, which is a major cell surface antigen of 
GAS and is its principal virulence factor, confers resistance 
to phagocytosis (54). Growth under conditions known to 
cure bacteria of plasmids or phages led to a reduction 
in M protein expression by GAS (36), suggesting that a 
mobile genetic element could be involved in M protein 
regulation. Subsequently, Spanier and Cleary described a 
temperate bacteriophage, SP24, that, upon lysogenization, 
greatly enhanced M protein production by streptococci 
(149). These investigators proposed that a hypothetical 
phage-encoded gene, mprA, upregulates expression of the 
chromosomal gene encoding the M protein. 

Temperate GAS phages have also been observed to play 
a role in the transfer of antibiotic resistance through 
transduction of resistance genes (30, 76, 154), although 
as in staphylococci there is no evidence that resistance 
genes are phage-encoded. Similarly, streptolysin S, a 
β-hemolysin with lytic activity against erythrocytes and 
leukocytes, was shown to be transduced by a derivative 
of phage A25 (147), although subsequent studies have 
identified a chromosomal locus responsible for streptolysin 
S synthesis (124). 

Streptococcus mitis 

The viridans streptococci, including Streptococcus mutans 
and Streptococcus mitis, colonize the oropharynx and can 
cause infectious endocarditis after gaining access to the 
bloodstream. A critical early step in the pathogenesis of 
infectious endocarditis is bacterial adhesion to the endo¬ 
cardium, platelets, and fibrin, which together form the 
characteristic vegetations on affected heart valves. Two 
S. mitis proteins, encoded by an inducible prophage (SM1), 
are found in the S. mitis membrane and are important 
in adhesion of S. mitis to platelets (12, 13). These proteins, 
PblA and PblB, resemble phage capsid components and 
are constituents of SM1 phage particles (13). It remains 
unknown whether the membrane-bound or the virion-associated 
forms of PblA and PblB, or both, mediate 
bacterial adhesion to platelets. UV light was shown to 
increase expression of the platelet-binding loci of S. mitis 
(12, 13), raising the possibility that prophage induction, 
perhaps by agents in the blood or endocardium, plays a role 
in expression of PblA and PblB in vivo. 

Streptococcus pneumoniae 

Streptococcus pneumoniae is a common cause of pneu¬ 
monia, otitis media, meningitis, and septicemia. One of 
the many factors contributing to the virulence of this 
organism is a cell wall degradative enzyme, autolysin 
(15, 16). The pathogenic mechanisms of autolysin are not 
well characterized, but may involve release of pneumococcal 
virulence factors or structural components. The majority
of pneumococcal clinical isolates contain multiple
autolysin-encoding loci, and many pneumococcal prophage
genomes hybridize with an autolysin gene (lytA) probe, 
suggesting that autolysins may actually be phage lysis 
proteins (134). In fact, a recently sequenced temperate 
pneumococcal phage encodes a gene with approximately 
90% identity to lytA (58). Phages may thus contribute to 
the as yet poorly defined role of autolysin in pneumococcal 
virulence. 


Bacteriophages in the Pathogenesis of Gram-Negative Organisms

Bordetella avium 

Bordetella avium, which is genetically related to B. pertussis, 
causes avian bordetellosis, a highly morbid pertussis-like
illness in young turkeys (146). B. avium shares with
B. pertussis several virulence factors, including dermonecrotic
toxin, tracheal cytotoxin, hemagglutinin, and fimbriae.
B. avium also encodes pertussis toxin (PT). Unlike in
B. pertussis, however, PT in B. avium is encoded by a temperate
generalized transducing phage, Ba1, although B.
avium has not been shown to produce measurable amounts
of PT (142). It was speculated that PT could be produced as a
late phage gene product and released via phage-mediated
lysis (159), although mitomycin C treatment does not
induce the Ba1 prophage or PT production by B. avium
(152), and there is no evidence to date that Ba1 plays a role
in B. avium pathogenesis (143). 

Escherichia coli 

Escherichia coli, a normal inhabitant of the human gastrointestinal
tract, causes a variety of human diseases
including urinary tract infections, pneumonia, meningitis, 
septicemia, and a wide range of gastrointestinal diseases. 
The numerous virulence factors of E. coli range from 
structural features (endotoxin, capsules, fimbriae, and 
adhesins) to exotoxins, which are particularly important 
in gastrointestinal disease and include the heat-stable and 
heat-labile enterotoxins. A subset of diarrheogenic E. coli, 
including E. coli O157:H7, produces a group of related toxins
called Shiga toxins (Stx). These toxins cause severe clinical
manifestations in addition to diarrhea (22, 127, 164),
including hemorrhagic colitis and the hemolytic uremic
syndrome (HUS), a potentially fatal systemic syndrome
characterized by acute renal failure, microangiopathic
hemolytic anemia, and thrombocytopenia. There are two
major classes of Stx (Stx1 and Stx2), each of which is
composed of a pentameric B subunit involved in transport
into the eukaryotic cell, and a monomeric enzymatic
A subunit that cleaves 28S rRNA (1), leading to inhibition
of protein synthesis.

Stx production by E. coli provides the most compelling
known example of virulence factor regulation by the
life cycle of a bacteriophage. The Shiga toxin genes are,
in all known cases, closely associated with bacteriophage
sequences, and are often present on functional lambdoid
prophages (32, 42, 89, 116, 122, 156, 166). Mitomycin C and
other phage-inducing antibiotics greatly enhance toxin
production by Shiga toxin-producing E. coli (STEC) (2, 3, 69,
74, 75), and a mechanism for this relationship was suggested
by genome sequence analysis of several Stx-encoding
phages, including H-19B (Stx1) and 933W (Stx2), which
revealed that the stx genes are located within the tightly
regulated late operon (105, 115, 121, 132, 180). Thus, the
toxin genes could be regulated as late phage genes, transcribed
in concert with the phage lysis and morphogenesis
genes at the appropriate time following prophage
induction.

Recently, the importance of prophage induction in Stx
production and release has been established, as have the
mechanisms underlying this relationship. Stx2 production
by lysogens of phage φ361 relied almost exclusively on
the late phage promoter (pR') and its associated antiterminator
gene (Q), both in vitro and in a mouse intestinal
infection model (168), indicating that Stx2 production
ultimately depends upon prophage induction. In contrast,
in lysogens of phage H-19B, late phage transcription was
not essential for Stx1 production (167), due to the presence
of an iron-regulated promoter (pStx1) adjacent to the stx1
coding sequence (28, 29, 44, 81). Nevertheless, phage
replication and phage-regulated transcription initiating at
the pR and pR' promoters each contributed to the increase
in Stx1 production observed following prophage induction
(167). Moreover, phage-mediated lysis controlled the
duration and therefore the total amount of Stx1 production,
and allowed for Stx1 release from the cell (167).
The central role of prophage induction in Stx production
and release could explain why prophage-inducing antibiotics,
often used in the treatment of STEC infection, are
epidemiologically associated with adverse clinical outcomes
(27, 31, 91, 131, 175). Furthermore, human cells (neutrophils)
and molecules released from them (H2O2) can induce
Stx-encoding phages and subsequent Stx production (165),
raising the possibility that agents present in the human
body may induce Stx-encoding prophages and thereby
contribute to STEC pathogenicity.

A number of other coliphages encode factors known
or suspected to have a role in E. coli pathogenesis, including
phage λ, which has long been known to enhance the
serum survival of its E. coli lysogens (118). Barondess and
Beckwith identified a novel outer membrane lipoprotein
(Bor) that is encoded by λ, secreted during lysogeny, and
important for serum resistance of λ lysogens (8, 9). A second
λ-encoded outer membrane protein (Lom) (8, 135) may
contribute to the adhesive properties of E. coli, since the
reported advantage of λ lysogens observed in adhesion
to buccal epithelial cells was absent in a λ lysogen bearing
a lom::TnphoA fusion (128). The contribution of lom
and bor to E. coli pathogenesis has not been tested in
animal models of infection. λ-like phages isolated from
pathogenic E. coli have also been shown to encode a hemolysin
(Ehly) (18-20, 151), although the significance of Ehly as
a virulence factor is uncertain.

Pseudomonas aeruginosa 

Pseudomonas aeruginosa is an important opportunistic pathogen,
commonly responsible for pneumonia and infections
of burns, wounds, and the urinary tract. Among the
many factors contributing to the virulence of P. aeruginosa
is a pore-forming cytotoxin (CTX) (4, 21, 126) that confers
virulence in an animal model (5) and is encoded by a
phage, φCTX (67). In this phage, the cos site lies between
the nearby toxin gene and the attP site, implying that
a more complicated mechanism than aberrant excision
must account for the acquisition of the toxin gene by φCTX
(67, 68, 119). Little is known regarding the control of
CTX production; however, Xiong et al. suggested that φCTX
integration into the chromosome, which brings an active
promoter into continuity with the ctx genes, could be
critical for CTX production (178). Consistent with this
hypothesis, CTX production in cultures of φCTX-infected
P. aeruginosa temporally followed φCTX integration (178).
This requirement would make φCTX unusual among toxin-encoding
phages, as there are no other cases in which
phage integration is known to upregulate toxin gene
transcription.

Other phages may contribute to the pathogenesis of 
P. aeruginosa in other ways. Several phages alter the chemical
composition and structure of the O-antigen of P.
aeruginosa upon lysogenization (14, 33, 46, 48, 70, 94, 97,
102, 104). These changes alter bacterial susceptibility to 
phage superinfection (70), and in principle could aid the 
organism in evading immune defenses (136). Another 
P. aeruginosa phage, FIZ15, has been shown to enhance 
adhesion of the organism to buccal epithelial cells (157). 

Salmonella enterica 

Salmonella species are responsible for a number of clinical
syndromes, the most important of which are gastroenteritis
and typhoid fever; the pathogenesis of the latter
involves a number of phage-related virulence properties.
Certain Salmonella species, including S. typhi, penetrate the
intestinal epithelial barrier and subsequently invade the
subepithelial tissues and regional lymphatic structures,
including Peyer’s patches. Following invasion, these Salmonella
species have the ability to survive within phagocytes
and spread hematogenously to involve lymphoid
structures throughout the body. Subsequently, organisms
can reinfect the intestinal tract, especially the gall bladder,
and can cause late complications, including intestinal
perforation, hemorrhage, and abscess formation. Invasion
by Salmonella requires a type III secretion system encoded
by Salmonella pathogenicity island 1 (SPI1), which injects
effector proteins directly into the cytoplasm of human
host cells (73). SopE, a protein that activates human Rho
GTPases and facilitates entry of Salmonella into tissue
culture cells, is one such effector protein (64, 65, 176),
and is encoded on SopEφ, a P2-like phage (114) (P2-like
bacteriophages are reviewed in chapter 25). This is a
remarkable instance of two mobile genetic elements
(SPI1 and SopEφ) cooperating to enhance the virulence of
Salmonella.

Following translocation of the intestinal epithelium, S. 
typhimurium preferentially localizes to Peyer’s patches, 
where a gene (gipA) encoded in the late operon of
another Salmonella phage, Gifsy-1, is specifically upregulated
(150). A gipA null mutant was selectively impaired
for growth in the Peyer’s patch following oral inoculation, 
but not impaired in general growth, attachment, invasion, 
or virulence following intraperitoneal injection (150). 
The O-antigen is thought to be an important virulence 
determinant of Salmonella, determining its susceptibility 
to complement and other serum proteins, and to phagocytosis
(136). In Salmonella, the O-antigen structure and
composition are altered by phage-encoded enzymes
(137, 138, 177). Since the structure of the O-antigen is one
determinant of the degree of susceptibility to phagocytes
and serum factors (82, 99), O-antigen alteration by bacteriophages
could be an important determinant of Salmonella
virulence. 

Inside the phagocyte, Salmonella is subjected to oxidative
stress within eukaryotic organelles that produce reactive
oxygen intermediates, including the superoxide radical.
Salmonella resists oxidative damage using enzymes such
as superoxide dismutase, which catalyzes the conversion
of superoxide ion into hydrogen peroxide and molecular
oxygen and has been implicated in Salmonella pathogenesis
(45, 50). Figueroa-Bossi and Bossi showed that
a Salmonella superoxide dismutase, SodC, is encoded by 
Gifsy-2, a functional bacteriophage capable of transducing 
sodC (52). Curing S. typhimurium of this phage resulted 
in attenuated virulence in a mouse infection model (52). 
Hydrogen peroxide was a highly effective inducer of 
Gifsy-2 (52), suggesting a possible relationship between 
SodC activity (which results in hydrogen peroxide production)
and Gifsy-2 induction.

Shigella 

Shigella species cause dysentery, a disease characterized 
by abdominal pain, tenesmus, and bloody diarrhea. Shigella 
organisms attach to and penetrate colonic epithelial cells, 
inside which they multiply and spread to adjacent cells,
leading to inflammation, ulceration, and frequent discharge
of stools containing blood, mucus, and pus. Two virulence
factors of S. dysenteriae are affected by phages: the O-antigen
and Shiga toxin. The O-antigen likely contributes to bacterial
evasion of human immune defenses by conferring resistance
to complement activation and phagocytosis (136).
Phage-encoded enzymes alter the O-antigen, as in P. aeruginosa
and Salmonella (above), which presumably benefits
the phage by conferring superinfection resistance on the
host bacterial cell (62, 110). However, since O-antigen
alteration is a mechanism of immune evasion by bacteria,
these phages may also make an indirect contribution to
Shigella pathogenicity (136).

Shiga toxin may damage intestinal epithelial cells by 
inhibiting protein synthesis in a manner identical to that of 
the Shiga toxins of E. coli (above). McDonough and Butterton
sequenced an approximately 32 kb segment of the
S. dysenteriae type 1 chromosome and found that stxAB,
long considered to be chromosomal genes, are embedded
within phage-like sequences, including a gene with more
than 95% homology with the S lysis gene of phage 933W
located 3' relative to stxAB (112). A large number of insertion 
sequences were also located near the stx genes, implying 
that they are present on a prophage rendered defective by 
the insertion sequences. 

Vibrio cholerae and Vibrio parahaemolyticus 

The curved Gram-negative rod Vibrio cholerae causes 
cholera, a severe, sometimes lethal, diarrheal disease that 
often occurs in epidemics. V. cholerae colonizes the small 
bowel and elaborates cholera toxin (CT), which accounts 
for the profuse watery diarrhea characteristic of cholera. CT 
consists of a pentameric B subunit, which binds to a receptor
on epithelial cells, and a monomeric enzymatic A
subunit, which transfers ADP-ribose to a G protein that regulates
the activity of adenylate cyclase (171). The resulting
increase in intracellular cAMP concentration leads
to voluminous secretion of chloride and water from epithelial
cells into the small bowel lumen. V. cholerae colonization
of the small intestine requires TCP, a bundle-forming
pilus, the production of which, like CT, requires the transcriptional
activator ToxR (148).

CT is encoded within the genome of CTXφ, the first
filamentous bacteriophage shown to participate in the
lysogenic conversion of its host bacterium (170). Unlike
the well-characterized F-pilus-specific filamentous coliphage
M13 (see chapter 12), CTXφ integrates into the
V. cholerae genome. Interestingly, the V. cholerae receptor for
CTXφ is TCP, and since TCP is produced during V. cholerae
intestinal colonization, it has been proposed that CTXφ
infection of V. cholerae occurs most efficiently within
the human intestine (170). Thus, lysogenic conversion
of an ancestral TCP+ nonlysogenic V. cholerae strain to
toxigenicity (creating a fully pathogenic V. cholerae) may
have occurred within the human host. Transcription of
ctxAB, which encodes CT, is thought to depend largely on
two activators: the chromosome-encoded ToxR and the
Vibrio pathogenicity island (VPI)-encoded ToxT (148). The
interaction between ToxT and the ctxAB promoter provides
another example of interplay between mobile genetic
elements (VPI and CTXφ) in bacterial pathogenicity, akin
to that between SPI1 and SopEφ in Salmonella. VPI has
been proposed to correspond to the genome of a prophage,
VPIφ (88), although this hypothesis awaits confirmation.
Phage-initiated transcription at a promoter other than
the ctxAB promoter can also direct ctxAB messenger
RNA production (98), but the significance of this ToxR-
and ToxT-independent ctxAB transcription for V. cholerae
pathogenicity remains unknown.

Before the discovery of CTXφ, two gene products (Ace
and Zot) encoded within the 6.9 kb CTXφ genome were
proposed to have enterotoxic activity. The Ace protein (153)
is now thought to be a CTXφ minor coat protein (170);
therefore, CTXφ secretion from V. cholerae may constitute
a mechanism for delivery of toxic virion particles to the
intestinal epithelium. Zot (51) is an ortholog of the M13
pI protein (93), which is thought to act in phage secretion
(43). The contribution to V. cholerae pathogenicity of these
two enterotoxins, now known to be essential phage gene
products, awaits further study.

Vibrio parahaemolyticus, like V. cholerae, causes diarrhea,
although its virulence factors are not well characterized.
Until recently, V. parahaemolyticus strains were not
thought to give rise to pandemics. However, one V. parahaemolyticus
serovar, O3:K6, may currently be causing a
pandemic of diarrheal disease in North America and Asia
(35). The filamentous phage f237, related to CTXφ but
lacking ctxAB, is closely associated with these V. parahaemolyticus
isolates (78), and ongoing studies are aimed
at determining whether f237 played a direct role in the
emergence of this pathogen.

Bacteriophages in the Pathogenesis of Mycoplasma

Mycoplasma arthritidis 

Mycoplasma arthritidis causes arthritis in rodents, and 
the arthritogenesis of M. arthritidis was found to be 
directly correlated with lysogeny of this organism by a 
bacteriophage designated MAV1 (163). The approximately 
16 kb genome of this phage was sequenced, leading to 
the identification of an open reading frame, vir, that encodes 
a product with a lipoprotein signal sequence (162). The 
contribution to M. arthritidis pathogenicity of this gene,
which is encoded in the opposite orientation relative to
all other MAV1 genes and transcribed during lysogeny,
remains to be explored. MAV1 and other mycoplasma
phages are reviewed in chapter 40.

Table 47-1 Contribution of Bacteriophages to the Virulence of Bacterial Pathogens

| Bacterial Pathogen | Mechanism | References |
| --- | --- | --- |
| Gram-positive pathogens | | |
| C. botulinum | Botulinum toxin is phage-encoded | 57 |
| C. diphtheriae | Diphtheria toxin is phage-encoded | 71, 155 |
| S. aureus | Fibrinolysin is phage-encoded | 139 |
| | Staphylococcal enterotoxins are phage-encoded | 17, 32, 39 |
| | Staphylococcal exfoliative toxins are phage-encoded | 179 |
| | Toxic shock syndrome toxin is encoded by SaPI1, a mobile pathogenicity island transduced at high frequency by φ80 | 100 |
| | Generalized transduction contributes to horizontal transmission of staphylococcal antibiotic resistance genes | 53 |
| | Phages encode CHIPS, a phagocytotoxin | 160 |
| | Phage φPVL encodes the Panton-Valentine leukocidin | 87 |
| S. pyogenes (GAS) | Hyaluronidase is phage-encoded | 77 |
| | Lysogeny upregulates the antiphagocytic M protein | 149 |
| | Streptococcal pyrogenic (erythrogenic, scarlatinal) exotoxins are phage-encoded | 60, 85, 172 |
| | Generalized transduction contributes to horizontal transmission of streptococcal antibiotic resistance genes | 30 |
| S. pneumoniae | Phages encode autolysins | 58, 134 |
| S. mitis | The SM1-encoded PblA and PblB surface proteins promote adhesion to platelets | 12, 13 |
| Gram-negative pathogens | | |
| B. avium | Pertussis toxin is phage-encoded in B. avium | 159 |
| E. coli | The λ-encoded bor gene confers a survival advantage in animal serum | 8 |
| | The λ-encoded lom gene promotes adhesion to buccal epithelial cells | 8, 128, 135 |
| | The Shiga toxins are phage-encoded and their production is regulated via prophage induction | 72, 123, 167, 168, 173 |
| P. aeruginosa | Phages are associated with O-antigen variation | 70 |
| | Pseudomonas cytotoxins are phage-encoded | 67 |
| | Phage FIZ15 promotes adhesion to buccal epithelial cells | 157 |
| S. enterica | Phage SopEφ transduces a type III secretion system effector that promotes entry into epithelial cells | 114 |
| | Phage Gifsy-1 encodes gipA, a gene that enhances survival in the Peyer’s patches | 150 |
| | Phage Gifsy-2 encodes SodC, a superoxide dismutase | 52 |
| | Phages are associated with O-antigen variation | 137, 138, 177 |
| S. dysenteriae | Phages are associated with O-antigen variation | 62, 110 |
| | The Shiga toxin genes are associated with phage sequences, probably a defective prophage | 112 |
| V. cholerae | Cholera toxin is phage-encoded | 170 |
| | The Ace and Zot enterotoxins are essential phage proteins | 170 |
| | The toxin-coregulated pilus may be phage-encoded | 88 |
| V. parahaemolyticus | The pandemic serovar O3:K6 is associated with phage f237 | 78 |
| Mycoplasma | | |
| M. arthritidis | The phage MAV1-encoded vir gene may contribute to arthritogenesis | 162, 163 |

Conclusions 

Bacteriophages play an important role in bacterial 
pathogenesis through a variety of mechanisms (table 47-1). 
They promote pathogen evolution by serving as vectors 
for the dissemination of virulence genes. Some phages 
regulate, through the process of prophage induction, when 
and where these virulence genes are expressed. Phage 
particles may contain structural components that act in 
pathogenesis, and some phages alter bacterial cell surface 
antigens in ways that could contribute to bacterial evasion 
of human immune defenses. Understanding new ways 
in which bacteriophages contribute to bacterial pathogenesis
is a fertile area for future research, and such knowledge
could suggest novel strategies for the prevention and
treatment of bacterial infections. 
The ability of bacteriophage (phage) to replicate
exponentially and lyse pathogenic strains of bacteria
suggests that they should play a vital role in our armamentarium
for the treatment of infectious diseases. However,
in spite of an initial enthusiasm, early clinical applications
resulted in a negative shift of opinion concerning the therapeutic
potential of phage. There are a number of factors
that may have been responsible for this rejection of the use
of phage as antibacterial therapeutic agents, particularly
in countries that require certification based on the results
of efficacy and pharmacokinetic studies in animals and
humans. These factors include an initial lack of understanding
of the relatively narrow host range of phage and
an inability to purify phage preparations from bacterial
products and debris. These contaminating materials often
include bacterial exo- and endotoxins along with bacterial
cellular components that tend to inactivate phage preparations
when they are stored without further purification
(60). Another major factor that affected the development
of phage therapy was the successful introduction of antibiotics
effective against a broad range of bacterial strains.
With such antibiotics physicians could often successfully
treat infections even before they determined the causative
bacterial strain. The narrow host range of phage made
duplication of such a practice questionable at best.

Despite the current widespread use of antibiotics, an
ever-increasing prevalence of antibiotic-resistant bacteria
suggests that phage therapy merits reconsideration. In addition,
knowledge gained since the initial phage therapeutic
applications, concerning phage genetics, physiology, and
molecular biology, should provide beneficial information
for current efforts to develop phage into a reliable therapeutic
agent. However, there are still a number of gaps in
our knowledge, including information on the interaction of
phage with mammalian organisms, such as the reactions
of innate and active mammalian immune systems to
phage. We also need to develop methods to understand
and improve on the pharmacokinetic behavior of phage.
In addition, we need techniques to facilitate the rapid determination
of the best phage strain to use for each specific
clinical infection before we can make full use of the
therapeutic potential of phage.

As the discovery of antibiotics was one of the major 
events that overshadowed the development of phage as 
an antibacterial therapeutic agent, we have divided the 
following historical introduction into pre-antibiotic and 
antibiotic eras. See chapter 1 for a discussion of phage 
history from the perspective of the development of the 
discipline of molecular biology. 


Pre-antibiotic Era Phage Therapy 

Shortly after phage were discovered by Twort in 1915 (77),
Felix d'Herelle championed the concept of using them to
treat bacterial infections (19). d'Herelle's first efforts were
concentrated on the treatment of avian typhosis in chickens
and Shigella dysentery in rabbits. Following his reported
success in these applications of phage as antibacterial
therapeutic agents, he extended the use of phage to the
treatment of bacillary dysentery in humans. In
pursuing his phage therapy studies he traveled around
the world, stimulating both basic and clinical phage
research (75). While many of d'Herelle's ideas concerning
phage have proven correct, one idea that he proposed
may have resulted in some of the clinical failures of phage
therapy. d'Herelle suggested that there might be only one
bacteriophage that could adapt to many bacterial hosts.
He even named this perceived phage capacity the “unicity
of the bacteriophage” (80). This belief that there was but
one phage that could adapt to all bacterial strains may have
led clinicians at that time to use inappropriate phage
strains. In 1928, d'Herelle was appointed to a professorship
in the bacteriology department at Yale University School
of Medicine (80). During his tenure at Yale he conducted
a number of phage studies including efforts to “adapt”
Staphylococcus phage to resist inhibition by factors in
human serum (20). Following his resignation from his
position at Yale in 1934, d'Herelle played a major role in the
establishment of an institute in Tbilisi, in Soviet Georgia.
This institute produced large quantities of phage for antibacterial
therapy during and immediately following the
Second World War and today is still actively pursuing
phage therapy applications.

Phage were also used as antibacterial therapeutic
agents in Poland and France, and they were distributed
for clinical practice in the United States by a number of
major pharmaceutical companies. In 1932, one European
laboratory was reported to be distributing 50 liters of
phage a day (39). Phage preparations were marketed by
three major US pharmaceutical companies. Eli Lilly & Co.
sold “Staphylo jel” and other phage “jel”-labeled products for
streptococcus and colon bacillus. Phage products marketed
by E. R. Squibb & Sons and the Swan-Myers division of
Abbott Laboratories included a bacteriophage filtrate preparation
for staphylococcus and a combined bacteriophage
filtrate preparation for staphylococcus and colon
bacillus, respectively (74). Problems with some of these
commercial phage preparations were found to be caused
by the use of organomercury compounds as preservatives.
Such preservatives often resulted in loss of phage activity
during storage. Variations in the phage strains from one
lot to another, although marketed under the same label,
further undermined the confidence of clinicians who might
otherwise have used these preparations (74). Additional
problems in early therapeutic applications of phage may
have occurred because of the lack of refrigeration or
adequate phage purification. Most phage preparations
intended for therapeutic applications were purified only
by passing the lysate through filters fine enough to remove
the host bacteria. While such purification reduces the
risk of bacterial infections, it does not remove bacterial
debris that can include bacterial endo- and exotoxins, and
these contaminants can result in patient morbidity and
mortality. Phage interactions with this material during
storage of phage preparations can also result in loss of
activity. These were some of the issues that resulted in the
establishment of a Council on Pharmacy and Chemistry by
the American Medical Association. This Council concluded
that phage therapy was plagued by a lack of basic
understanding and standards for purity or effectiveness
(23). A recent review by Ho (32) presents additional information
pertaining to this period.

Researchers interested in phage therapy were also
concerned that even when phage with demonstrated
in vitro antibacterial effects were used for clinical infections,
factors present in the serum, tissue debris, cellular components,
etc., would inhibit bacterial lysis by phage. They
suggested that given these issues any positive effect of
phage therapy on the course of an infection was probably
due to stimulation of specific antibacterial immunity,
and/or nonspecific phagocytic activity (22, 24).

By 1937, the state of affairs had deteriorated to the
point that researchers such as Asheshov and his colleagues
stated that “no satisfactory evidence has yet been
obtained that a phage exerts any significant effect on
the course of an experimental infection” (5). Despite their
skepticism, Asheshov and his colleagues recognized that
part of the problem was associated with the difficulty of
repeating experimental results due to failures of experimental
design. By using a specific strain of bacterium and
phage strains that were shown to be active in vitro
against the selected bacterial strain, these researchers
demonstrated the ability of the phage to rescue mice
injected intraperitoneally with lethal concentrations of
“Bacter. Typhosum.” They were able to rescue a significant
number of mice even when the phage injections were given
as late as 4 hours after the bacterial injection. Without
phage treatment most of the animals died within 24
hours. Furthermore, they demonstrated that heat-killed
phage and nonspecific phage strains were not able to rescue
the animals. These experiments clearly showed that
phage that display antibacterial activity against a particular
bacterial strain in vitro may serve as an antibacterial
therapeutic agent for the treatment of an animal with a 
systemic infection with that bacterial strain. They also 
demonstrated that the antibacterial effect of the phage is 
due to the physiological functions of the phage since heat- 
killed and nonspecific phage could effect no such rescue of 
infected animals. 

Dubos et al. (22), at Yale University, addressed the
concerns that factors in blood, tissue, and bile might
interfere with the lytic activity of phage and that such
interference would render phage impotent as antibacterial
therapeutic agents. They demonstrated that such interference
effects, if present, were minimal as they were able
to rescue mice infected intracerebrally with Shigella dysenteriae
by injecting anti-Shiga phage into the general circulation.
In these experiments they also observed a correlation
between an increase in phage titer observed in the blood of
infected animals and their rescue, suggesting that the
rescue of the animals was due to phage functions, including
replication, and that interfering factors, if present, were
insufficient to inhibit the beneficial effects of phage as
potential antibacterial therapeutic agents (see chapter 5 for
a discussion of phage population growth and its impact on
bacterial population density).

These positive developments came too late to generate
much enthusiasm for phage therapy in the Western world.
By this time antibiotics were proving superior because of
their activity against a broad range of bacterial hosts and
in most cases their robust storage characteristics. While
the use of phage therapy waned in Western medical
practice it continued to be employed in Eastern Europe and
parts of Asia. This was due in part to the restriction of
information concerning the development of antibiotics, particularly
penicillin, by the British and American governments
in the early phase of the Second World War. The
Soviet and Polish phage literature concerning the therapeutic
use of phage has been extensively reviewed (1).
The reviewers note that the Soviet and Polish medical
researchers studied the efficacy of phage therapy almost
exclusively by qualitative clinical assessments of patients.
Details of phage dosages and clinical criteria were reported
in a “sketchy” manner. For these reasons, most of the
studies from Eastern Europe will not meet current standards
in countries that require certification based on the
results of efficacy and pharmacokinetic studies in animals
and humans.

Antibiotic Era Phage Therapy 

It was 30 years after the work of Dubos et al. before animal studies
were again performed to investigate the efficacy of phage 
to treat bacterial infections. This hiatus was due in large 
part to the success of antibacterial chemotherapeutics 
such as the sulfonamides, discovered in 1935, followed by 
the antibiotics during and following the Second World 
War. However, some of the resistance may also have stemmed
from theoretical arguments that arose to explain the
perceived failure of phage therapy. These
are best captured by the following statement from Stent's
1963 book, Molecular Biology of Bacterial Viruses (73):

Just why bacteriophages, so virulent in their antibacterial
action in vitro, proved to be so impotent in vivo has
never been adequately explained. Possibly the immediate
antibody response of the patient against the phage
protein upon hypodermic injection, the sensitivity of the
phage to inactivation by gastric juices upon oral
administration, and the facility with which (as we shall see
presently) bacteria acquire immunity or sport resistance
against phages, all militated against the success of phage
therapy.

As suggested by Stent, antibodies, produced by the 
adaptive immune system, may be of importance for the 
inactivation of phage, particularly in individuals repeatedly 
exposed to a specific phage strain. This ability of phage to 
provoke an antibody response in normal individuals has 
been used over the past three decades by Ochs and his 
colleagues, who employ the phage φX174 to study normal
individuals and patients with immune deficiencies (56, 57). 
They demonstrated that the adaptive immune system of 
normal individuals who are naive to a particular phage 
strain requires a few days to develop a detectable anti¬ 
body level and about 2 weeks for a maximal antibody 
response. However, the innate immune system has been 
shown to be able to rapidly eliminate phage administered
systemically. This was investigated in an experiment,
published in 1973, that demonstrated that phage injected 
systemically in germ-free mice were removed rapidly, by the 
liver and spleen (reticuloendothelial system (RES) of the 
innate immune system), from the circulatory system even 
though these mice displayed no antibody activity against 
the phage (30). The authors suggested that this rapid
elimination of phage in intact animals “may explain the limited
success reported for the phage treatment of infectious
diseases.” They also suggested that the rapid rate of phage
elimination could be slowed by overwhelming the RES with 
colloidal particles. 

Recently a less intrusive method was discovered to
circumvent this rapid systemic elimination of phage. This
method uses serial passage techniques to select mutant phage
strains with reduced rates of clearance by the RES (50).
Infected mice treated with these long-circulating phage
variants recovered more rapidly, and their symptoms were less
severe, than mice treated with wild-type phage. In these
experiments, long-circulating lytic-mutant λ and P22 phage
(reviewed in chapters 27 and 29, respectively) were developed
to treat mice infected intraperitoneally with Escherichia coli
and Salmonella typhimurium, respectively. In the case of λ
phage, the mutants displayed a single base change in the
capsid E gene that resulted in the substitution of a lysine
for the normal glutamic acid residue in this capsid protein.
Given the large number of copies of this protein in the phage
capsid, the resulting alkaline shift in the mutant phage
may be related to the capacity of this phage
to remain in the circulatory system for an extended period.

Another concern, that “bacteria acquire immunity or 
sport resistance against phage,” as expressed by Stent, was 
addressed experimentally by Smith and Huggins (67). As 
predicted by Stent, Smith and Huggins found phage-resistant
bacterial mutants following the use of a single dose of an
E. coli K1 strain-specific phage to treat mice that had been
infected, either intramuscularly or intracerebrally, with a
K1 strain of E. coli. Interestingly, the phage treatment of
these infected mice still proved to be more efficacious than
treatment with antibiotics. In fact, phage-resistant bacterial
mutants arose less frequently following phage therapy than
antibiotic-resistant mutants did after antibiotic therapy.
Furthermore, the phage-resistant bacterial mutants that were
observed were found to lack the K1 capsule (the receptor
for phage attachment), the absence of which is associated
with a loss of pathogenicity.

Smith and Huggins’ experiments, demonstrating that 
phage injected intramuscularly could be used to treat an 
intracerebral infection, corroborated the findings of Dubos 
et al. (22) obtained 39 years earlier, in which phage injected
into the circulatory system were successfully used to
treat an intracerebral Shigella dysenteriae infection in mice.
These observations may help to mitigate concerns that
phage, due to their larger size, will not be as effective in 
treating tissue infections as the lower molecular weight 
antibiotics (42). 

Experiments by Soothill (71) showed that the lytic development
of the phage is a critical parameter in the ability
of phage to control bacteria in vivo. In separate experiments,
animals were given intraperitoneal injections of
either Acinetobacter baumannii or Pseudomonas aeruginosa
and then treated successfully using phage specific for
each. However, in a similar experiment with Staphylococcus
aureus, treatment with phage failed. The Staphylococcus phage
used were significantly less active in vitro than the
phages for the other organisms.

Experimentally induced septicemia and meningitis in 
chickens and calves have also been successfully treated 
with phage (7). In these studies, E. coli K1 strains were used
along with the K1-specific R phage (the same phage strain
as that used by Smith and Huggins). The phage was able
to rescue chickens even when the administration was
delayed until the onset of symptoms. As in the work by
Dubos et al., 55 years earlier, in vivo phage multiplication
was found in the brains of infected animals, even
when the phage were injected intramuscularly. It is not
clear whether the capacity of the phage to cross the
blood-brain barrier was due to effects from an inflammatory
response to the bacterial infection or whether
phage can normally cross this barrier. It was also found
that phage taken up by the spleen persisted in significant
numbers for several days after injection, corroborating
the findings of Geier et al. (30).

Much of the interest in reviving phage therapy has 
been fuelled by the appearance of antibiotic-resistant 
bacteria. In this regard, Biswas et al. (10) showed that phage
specific for vancomycin-resistant Enterococcus faecium
could rescue mice that were infected intraperitoneally
with these bacteria. If titers of phage equivalent to the titers
of infecting bacteria were given 45 minutes after infection,
100% of the mice were rescued, and even when treatment
was delayed for 24 hours, when the mice were
moribund, 50% could still be rescued. Phage may also be
used to treat antibiotic-resistant intracellular pathogens.
Broxmeyer et al. (12) have demonstrated that it is possible
to use Mycobacterium smegmatis, an avirulent mycobacterium,
as a vector to deliver the lytic phage TM4 to
treat intracellular mycobacterial infections (with either
Mycobacterium avium or Mycobacterium tuberculosis) in
macrophages. These examples of efforts to develop phage
therapy represent cases where specific needs exist for the 
treatment of clinical infections that are not treatable by 
current methods. As the occurrence of drug-resistant 
pathogens increases, we expect to see an increase in efforts 
to develop phage into a viable alternative to antibiotics. 

All the work described above involved the treatment
of systemic infections using phages injected either
intramuscularly or intraperitoneally. Phage have also been
demonstrated to be effective therapeutic agents for the
treatment of nonsystemic infections. Several studies have
shown that gastrointestinal infections can be treated by 
oral administration of phage. Smith and Huggins (67) and 
Smith et al. (68, 69) showed that various phages could 
protect calves, pigs, and lambs from gastrointestinal infections
caused by enteropathogenic strains of E. coli. In addition,
Ramesh et al. (59) found that oral administration of a 
bacteriophage specific for Clostridium difficile could prevent 
ileocecitis caused by this organism in hamsters. Since C. 
difficile colonization of the gut is a common consequence 
of extended antibiotic treatment due to destruction of the 
normal gut flora, phage used in such a manner may prove 
to be useful in conjunction with antibiotic therapy. 

A study by Soothill (72) showed that bacteriophage could 
prevent destruction of skin grafts by Pseudomonas aeruginosa.
This bacterium commonly colonizes burn wounds
and is often difficult to treat with antibiotics. There is an 
immediate need for a treatment of this problem, and topical 
application of phages to burn wounds with Pseudomonas 
infections may be an attractive way to reintroduce phages 
into the modern clinical setting. 

Terrestrial animals are not the only candidates for phage 
therapy. Recent studies have shown that phage can be used 
to treat bacterial diseases of fish in aquaculture (53). In addition,
phages have recently been shown to be effective in
the treatment of bacterial blight of geranium (26) and 
bacterial spot on tomatoes (25). 

Selection and Characterization of Therapeutic Phage Factors

Phage are the most abundant entities on the planet (there are
estimated to be more than 10^30 phage particles (13); see, for
example, chapter 33 for discussion of this abundance).
However, only a few phage strains will prove to be effective
as therapeutic antibacterial agents. A number of factors can
affect the therapeutic efficacy of phage chosen for use as
antibacterial therapeutic agents.
Studies of phage multiplication in bacteria in vitro may 
provide information as to bacterial host range and a good 
first approximation of whether the phage may be appropriate 
for a particular clinical infection. However, observation of 
phage multiplication in defined culture media does not 
take into account the interactions of the phage with the 
bacteria in the clinical environment. Bacterial gene expression
and phenotype may be affected by numerous variables in the
clinical milieu, ranging from differences in the basic
nutrients to altered physiological parameters, including pH
and ionic strength. In addition, clinical infections have the
added complication of interactions of the innate and
active immune systems with both the infectious bacteria and
the therapeutic phage. While phage therapy, as described
above, has had a relatively long, but checkered, history,
careful scientific study
of factors associated with therapeutic efficacy has yet to 
be carried out. 

Phage chosen as antibacterial therapeutic agents need 
to be well characterized, the genome sequenced, and 
much of the biology of the phage well understood before it 
is developed for therapeutic use. Phage host range, virulence,
stability, interaction with both innate and active
immune systems, as well as the phage's possible capacity
to lysogenize and transduce, must be understood to provide
the greatest chance of efficacy and safety. 

In addition, therapeutic phage strains need to be tested 
and selected for their ability to function in the milieu of 
the human physiological systems. Interactions with both 
the innate and active immune system need to be mini¬ 
mized for phage strains that are employed for the treatment 
of systemic infections and pharmacokinetic parameters 
need to be determined. 

Host Specificity of Phage 

The host range of phage is generally narrower than that 
found in antibiotics selected for clinical applications. Most 
phages are specific for one species of bacteria and many 
are only able to lyse specific strains within a species
(nevertheless, see chapter 46). The limited host range of phage
can be both an advantage and a disadvantage in phage 
therapy. 

The advantage of a narrow phage host range is that 
the use of such phage in antibacterial therapy results in 
less harm to the normal body flora and ecology. In contrast, 
antibiotics, with their ability to affect a wide range of
bacterial strains, often disrupt the normal gastrointestinal
flora. Such side effects of antibiotic therapy can result in
opportunistic secondary infections by organisms such as
Clostridium difficile (8). While this type of side effect should
not be a problem with phage therapy, the narrow phage 
host range does require a means of determining the specific 
phage strain needed for each infection that is to be treated. 
This requirement for the use of phage as an antibacterial
agent presents two major problems in the current
clinical setting. The first problem is the need to have a
battery of well-characterized phages available for a broad
range of pathogens; second, there must be a timely method
available to determine which phage strain will be effective
for a given infection (as discussed below under “Methods
for the Rapid Determination of Phage Specific for Infecting
Pathogen”).

Many different phage strains will need to be identified, 
characterized, and developed to cover even a proportion 
of the bacterial diseases that could be good candidates 
for phage therapy. However, using current molecular techniques
it may be possible to enhance the host range of
some phage, thus reducing the number of phages that
need to be developed. For example, it has been found that
coliphage K1-5 is a “dual” specificity phage that encodes two
different tail proteins, allowing it to attack and replicate
on both K1 and K5 strains of E. coli (62). One tail protein
found on phage K1-5 is a lyase protein, similar to that of
phage K5 (specific for the K5 polysaccharide capsule), and
a second tail protein found on this phage is an endosialidase
similar to a tail protein found in phage K1E (specific
for the K1 polysaccharide capsule). In addition, the genomic
region encoding these proteins is almost identical to the
genomic construct found in the Salmonella phage
SP6, which codes for a protein that binds to the Salmonella
O-antigen (figure 48-1) (63).
Figure 48-1 Phage tail protein genes. Diagram of tail protein genome-encoding regions for the coliphages K1E, K5, K1-5,
and the Salmonella phage SP6. All these phages share a similar promoter region and an intergenic region with a putative
transcription terminator. This “modular” genomic construct suggests a horizontal gene transfer mechanism for host
range variation in nature that can be adapted for phage to be used as therapeutic antibacterial agents. These phages display
additional qualities necessary for phages that will be used in antibacterial therapy: they produce progeny phage with a large
burst size, and also show little if any loss of titer on storage.
The observation of a similar tail genome motif in both
the Salmonella phage SP6 and the coliphages K1E, K5, and
K1-5 suggests that this genomic construct might serve in
the development of a modular phage platform that could
operate over a wide bacterial host range. The development
and use of such a “modular” phage would also save
considerable time and effort over that required for
characterizing completely new phages for each bacterial strain.

Other mechanisms have been found in nature to 
expand the bacterial host range of phage. These include the 
site-specific recombination systems that permit phage to 
switch between alternative tail fiber proteins (61) and the 
reverse transcriptase by which a Bordetella phage provides 
for variations in its tail fiber proteins (44). It may be possible
to adapt these mechanisms to extend the host range of
therapeutic phages. 

Parameters besides tail fibers and their interaction
with cellular receptors can be important for host specificity.
Restriction/modification systems may limit the host
range of phage in some bacterial strains. It may be possible
to address this problem by engineering phages with
genomes that do not contain restriction sites recognized
by the nonpermissive host. Alternatively, phages could be
produced in bacterial strains that provide DNA modification(s)
that allow the phage to escape restriction in the
targeted bacterial strain. Another approach that might be
employed to address the restriction/modification problem
when engineering a phage is exemplified by a mechanism
used by the phage T7. This phage expresses a gene, 0.3, early
in the infection process that codes for a protein which is
a potent inhibitor of type I DNA restriction and modification
enzymes (6, 51). A construct containing this gene
might be adapted for use in other phage strains or it may
be possible to modify T7 to expand its bacterial host range
for E. coli infections (phage T7 biology is reviewed in
chapter 20).
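
As a rough illustration of the first strategy, the number of recognition
sites that a host restriction enzyme would find in a candidate phage
genome can be counted directly from the sequence. The sketch below is a
toy under stated assumptions: the function names, the short example
sequence, and the choice of the familiar type II sites for EcoRI (GAATTC)
and BamHI (GGATCC) are illustrative only and are not taken from the
studies cited here. Type I systems, such as those inhibited by the T7 0.3
gene product, recognize bipartite sequences that this exact-match search
does not handle.

```python
# Illustrative sketch only: the site list and the toy genome are assumptions,
# not data from the studies cited in this chapter.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def count_site(genome, site):
    """Count occurrences of a recognition site on either strand.

    Overlapping matches are counted. A palindromic site (equal to its own
    reverse complement) is searched only once, so it is not double counted.
    """
    genome = genome.upper()
    site = site.upper()
    patterns = {site, reverse_complement(site)}  # a set removes the duplicate
    total = 0
    for pattern in patterns:
        start = genome.find(pattern)
        while start != -1:
            total += 1
            start = genome.find(pattern, start + 1)
    return total


def site_report(genome, sites):
    """Map each enzyme name to the number of its sites in the genome."""
    return {name: count_site(genome, seq) for name, seq in sites.items()}


# Hypothetical panel of host restriction enzymes (familiar examples only).
EXAMPLE_SITES = {"EcoRI": "GAATTC", "BamHI": "GGATCC"}

toy_genome = "ATGGAATTCCGTACGGATCCTTGAATTCA"
print(site_report(toy_genome, EXAMPLE_SITES))
```

A genome with a count of zero for every enzyme expressed by the target
strain would, in principle, escape those particular systems; a nonzero
count indicates the sites that would have to be removed or protected.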

In some cases, phage may fail to replicate in a particular
host because they lack one or two genes essential
for the replication of the phage. Such gene(s) can be identified
and then incorporated in the phage genome. For
example, phage λ does not normally replicate in Salmonella.
However, when a λ phage library containing copies of the
E. coli genome was tested, it was found that λ phage
carrying the E. coli nusA gene could replicate in a Salmonella
strain, provided that the receptor protein for λ attachment is
already expressed in the Salmonella strain (S. Adhya,
unpublished observations).

In addition to the factors addressed above, bacteria
grown with standard laboratory protocols may not behave
the same when they are growing in the milieu of an infection.
This point was emphasized in a review by Hollon (33)
in which he cited observations of Karakawa that Staphylococcus
aureus rarely expresses the capsular polysaccharides
found in clinical isolates when the bacteria
are grown in the laboratory. Given such a change in the
bacterial capsule, a phage discovered using bacteria grown
in the laboratory may not be able to multiply in the same
bacterial strain in an infected animal. In the early
phage literature there are reports of body fluids (serum,
pus, ascitic fluids, cerebrospinal fluid, urine, and bile)
inhibiting the infectivity of phage against typhoid, colon
bacilli, and staphylococci (14, 17). More recently, it has been
reported that phage infecting certain strains of E. coli that
are not expressing the cell surface protein Ag43 in standard
laboratory growth media may be inhibited by concentrations
of bile salts similar to those found in the
gastrointestinal tract (28). The Ag43 protein is a phase-variable
protein whose expression is associated with E. coli
biofilm formation (18). Recognition of these problems is
important in isolating phage for clinical applications.


Undesirable Phage Genes 

While phage can be used to treat bacterial infections,
they can also play a major role in bacterial pathogenesis.
A number of phage genes have been discovered that
encode toxins or factors that enhance bacterial virulence.
Phage may also contribute, through transduction, to the
transmission of antibiotic resistance genes (81). It may be
possible to reduce the occurrence of such adverse effects
by sequencing the genomes of phages of interest for therapeutic
applications and using this sequence information
to search for homologies with known toxin genes, islands
of pathogenicity, or genes that foster integration of DNA
into the bacterial genome. Known phage-encoded toxin
genes are summarized in table 48-1; see also chapter 47 
for a broader discussion of the role of phage in bacterial 
pathogenicity. 

The presence of such toxin genes or genes with similar
sequences can be found by searching phage
genomes against GenBank online using the Basic Local
Alignment Search Tool, BLAST (2). In addition, this
approach can be used to search for drug-resistance genes,
phage genomic-integration factors, or other potential genes
that may increase the virulence of a bacterial strain. Such
BLAST searches take into account only similarities to
known genes, and it can certainly be assumed that there
are other, as yet unidentified toxins and potentially detrimental
genes that do not have sequence similarity to
anything currently in the databases. However, knowledge
of toxins, drug resistance, and other potentially troublesome
genes is increasing rapidly, as is the number of
completely sequenced phage and bacterial genomes. For
these reasons, such database searches will become increasingly
useful and they should help to assure that phage
chosen for use as antibacterial therapeutic agents are
free of genes that might harm the bacterially
infected humans, animals, or plants being treated with
phage therapy.
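
The filtering step of such a screen might be sketched as follows,
assuming that the predicted open reading frames of a phage have already
been compared with a collection of toxin genes and that the results were
saved in the 12-column tabular format written by BLAST+ (-outfmt 6). The
sequence identifiers, the two example rows, and the E-value and identity
cutoffs are hypothetical and chosen only for illustration; suitable
thresholds would have to be set for each application.

```python
import csv
import io

# Default column order of BLAST+ tabular output (-outfmt 6).
FIELDS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
          "qstart", "qend", "sstart", "send", "evalue", "bitscore"]


def flag_hits(tabular_text, max_evalue=1e-5, min_identity=30.0):
    """Return hits that pass the (assumed) E-value and identity cutoffs.

    Each flagged hit is a tuple: (query id, subject id, % identity, E-value).
    """
    reader = csv.reader(io.StringIO(tabular_text), delimiter="\t")
    flagged = []
    for row in reader:
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        hit = dict(zip(FIELDS, row))
        identity = float(hit["pident"])
        evalue = float(hit["evalue"])
        if evalue <= max_evalue and identity >= min_identity:
            flagged.append((hit["qseqid"], hit["sseqid"], identity, evalue))
    return flagged


# Hypothetical results: one strong match and one weak, probably chance, match.
EXAMPLE = (
    "orf12\ttoxin_stxA\t98.1\t315\t6\t0\t1\t315\t1\t315\t1e-180\t640\n"
    "orf40\ttoxin_ctxB\t27.5\t80\t58\t2\t5\t84\t10\t89\t0.8\t22.1\n"
)

for query, subject, identity, evalue in flag_hits(EXAMPLE):
    print(f"{query} resembles {subject}: {identity}% identity, E = {evalue}")
```

An empty list from such a filter means only that nothing in the chosen
database resembles the phage genes; it does not show that the phage is
free of unidentified toxins.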
Table 48-1 Phages that Carry Toxin Genes (adapted from 11)

Bacteria                      Phage             Gene        Gene product/phenotype
Escherichia coli O157:H7      933, H-19B        stx         Shiga toxins
                              φFC3208           hly2        Enterohemolysin
                              λ                 lom         Serum resistance
                              λ                 bor         Host cell envelope protein
Shigella flexneri             Sf6               oac         O-antigen acetylase
                              SfII, SfV, SfX    gtrII       Glucosyl transferase
Salmonella enterica           SopEφ             sopE        Type III effector
                              Gifsy-2           sodC-1      Superoxide dismutase
                              Gifsy-2           nanH        Neuraminidase
                              Gifsy-1           gipA        Insertion element
                              ε34               rfb         Glucosylation
Vibrio cholerae               CTXφ              ctxAB       Cholera toxin
                              K139              glo         G-protein like
                              VPIφ              tcp         TCP pilin
Pseudomonas aeruginosa        CTXφ              ctx         Cytotoxin
Clostridium botulinum         C1                a           Neurotoxin
Staphylococcus aureus         NA                see, sel    Enterotoxin
                              φ13               entA, sak   Enterotoxin A, staphylokinase
                              TSST-1            tst         Toxic shock syndrome-1
Streptococcus pyogenes        T12               speA        Erythrogenic toxin
Corynebacterium diphtheriae   β-phage           tox         Diphtheria toxin

Pharmacokinetics of Phage Therapy 

Phage in Mammalian Host 

Pharmacokinetic data concerning phage therapy are still 
in a rudimentary state despite the long history of phage 
use and study. Many early clinical applications of phage
therapy employed oral administration of phage preparations
with little or no effort to determine phage uptake
or distribution. While oral administration may have
diminished the possible side effects from contaminants,
including endo- and exotoxins that are often present in
filter-“purified” phage preparations, it may not have been
the most effective route for the treatment of systemic
infections. Determination of the most effective therapeutic
regimen(s) for phage therapy requires pharmacokinetic
information. 

There were some early practitioners of phage therapy 
who recognized the need to learn the fate of phage injected 
into animals. However, these early researchers generally 
employed qualitative methods and they only reported 
whether lysis had occurred following the incubation of 
ground-up tissue or drops of blood with the host bacteria 
in liquid media. Despite these limitations, such efforts led 
to the observation in 1921 that phage injected into the 
circulatory system of rabbits could still be found in the 
spleen long after no trace of phage could be found in other 
organs or in the blood (3). This finding was corroborated
12 years later, in 1933, in an experiment in which, 3 days
after the intravenous injection of a rabbit with phage, the
animal was killed and the liver, spleen, and blood examined
for phage. The liver and spleen, “crushed to a pulp in
a mortar,” and a sample of blood were independently
incubated in growth media with the host bacteria. At that time
phage could no longer be detected in the blood or the
liver, but it could be found in the spleen (24). One of the
first quantitative studies of the fate of phage in animals
was performed by Nungester and Watrous (55). They
reported that following the intravenous inoculation of
10^9 plaque forming units (PFU) of a Staphylococcus phage
into albino rats, the titer in the blood dropped to 10^5 PFU
in 5 minutes and about 4 x 10^1 PFU in 2 hours. This rapid
elimination of phage from the circulatory system was
attributed to the organs of the RES, primarily the liver
and spleen.
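
The scale of this clearance can be made concrete with a short
calculation. The sketch below converts the three titers reported by
Nungester and Watrous into log10 reductions and apparent half-lives. It
rests on two assumptions that are not part of the original report: that
the inoculum and the blood titers are expressed in comparable units, and
that loss within each interval is first order (exponential). The function
names are illustrative.

```python
import math


def log10_reduction(initial_pfu, final_pfu):
    """Orders of magnitude by which the titer fell."""
    return math.log10(initial_pfu / final_pfu)


def apparent_half_life(initial_pfu, final_pfu, elapsed_minutes):
    """Half-life in minutes, ASSUMING first-order (exponential) loss."""
    return elapsed_minutes * math.log(2) / math.log(initial_pfu / final_pfu)


# Titers reported by Nungester and Watrous (55): inoculum, 5 min, 2 h.
dose, at_5_min, at_120_min = 1e9, 1e5, 4e1

early = apparent_half_life(dose, at_5_min, 5)
late = apparent_half_life(at_5_min, at_120_min, 120 - 5)

print(f"0-5 min:   {log10_reduction(dose, at_5_min):.1f} log10 drop, "
      f"half-life about {early:.2f} min")
print(f"5-120 min: {log10_reduction(at_5_min, at_120_min):.1f} log10 drop, "
      f"half-life about {late:.1f} min")
```

Under these assumptions the apparent half-life is well under 1 minute
during the first 5 minutes and roughly 10 minutes thereafter, which
suggests that clearance is not a single exponential process.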

In experiments using T4 phage as a probe of the 
immune system, Inchley (35) found that the liver phagocytosed
more than 99% of the phage within 30 minutes
after inoculation and that it removed 12 times as much
phage as the spleen, as measured by the uptake of 51Cr-labeled
phage. Additional studies demonstrated that the
liver inactivated the phage at a higher rate than the spleen,
as measured by PFUs of phage that could be detected in
these organs and the rate of loss of radioiodine-labeled T4 phage
in these organs (T4 biology is reviewed in chapter 18).
Figure 48-2 The distribution of phage plaque forming units in mice following various routes of administration. This graph
was adapted from data from a 1973 experiment in which germ-free mice were inoculated with a single dose of 2 x 10^12
plaque forming units (PFU) of λ phage. In these experiments oral administration of phage resulted in the detection of
systemic phage tissue titers that were 7 to 8 orders of magnitude lower than those achieved by systemic
administration of phage (30).


The distribution of phage was examined in
the previously mentioned study by Geier et al. (30), which
employed germ-free mice. These mice had no detectable
antibody activity to the λ phage. Despite the lack of
these antibodies, the animals displayed rapid elimination
of the phage from the circulatory system and retention
of active phage, as measured by PFU, in the spleen
(figure 48-2). As there were no detectable antibody
levels for the phage employed in these experiments,
this initial reaction to phage by the animals must be
attributed to the innate immune system. This study also
demonstrated only trace amounts of phage in blood
and organs of the mice that received phage by oral
administration.

Based on the results of these experiments demonstrating
the ability of the mammalian host defense systems to
remove phage, a serial-passage selection method was developed
to identify phage variants with a capacity to remain
for longer periods in the circulatory system (50). This
system was used in mice to select two E. coli phage λ variants
that demonstrated 16,000- and 13,000-fold greater
capacities to remain in the circulation by evading the
animal's host defense systems; similar results were
obtained with the Salmonella typhimurium P22 phage.
These long-circulating mutant phages were demonstrated
to be of value in treating animals with bacterial infections.
In these experiments, there was less morbidity
in E. coli-infected mice treated with the long-circulating λ
phage mutants. In similar experiments conducted with S.
typhimurium P22 phage, mice infected with Salmonella
typhimurium also displayed less morbidity and mortality
when they were treated with the long-circulating mutant
P22 (50).


The experiments described above suggested that the
loss of phage was due to interactions with the RES. In addition,
innate immune system blood factors have also been
found to be of importance. Sokoloff et al. (70), using a T7
phage peptide-display library (see chapter 44), found a correlation
between the peptides displayed and survival of the
phage in the rat circulatory system. Phage displaying
C-terminal lysine or arginine residues had longer circulating
half-lives. In addition, in rat serum T7 phage inactivation
was associated with complement activation. The T7
phage displaying C-terminal lysine or arginine residues
were found to be protected from this complement-mediated
inactivation by binding to the C-reactive protein,
which is normally elevated in rats and mice. However, in
human serum, phage resistant to inactivation were found 
to display peptides containing tyrosine residues, not 
lysine or arginine as in the rat experiments. In human 
serum the protective protein may be ai-nmcroglobulin 
and not C-reactive protein, as found in the rat and mouse 
serum experiments, because in contrast to the rat, C-reactive 
protein is not normally elevated in human serum (70). 
These T7 phage peptide-library experimental results may
also help to explain the finding that there was substitution
of a lysine for a glutamic acid residue in the
E capsid protein in the long-circulating mutant λ phage
used as an antibacterial therapeutic agent in mouse experiments
(50).

In addition to selecting phage that can remain longer 
in the circulatory system, it has been possible, by using 
phage display libraries, to select phage that display specific 
peptide sequences that appear to influence the binding 
or uptake of phage by the vascular endothelium in specific
regions of the body (76). This in vivo screening method
has also been employed on one patient in an effort to
develop a molecular map of the human vasculature (4). 
The ability to target specific regions of the body may be 
useful for the treatment of localized infections. 

In some infections, the pharmacokinetics of the whole 
organism may be secondary to the ability to deliver 
phage intracellularly. This would be particularly important
in diseases such as tuberculosis, in which the intracellular
infection of macrophages can serve as a reservoir
for spread of the infection throughout the body. In this
regard, Broxmeyer et al. (12) have demonstrated that it is
possible to deliver the lytic phage TM4 to intracellular
locations in macrophages by using Mycobacterium smegmatis,
an avirulent mycobacterium, as a vector. In their
experiments they showed that such treatment could 
reduce the titers of Mycobacterium avium or Mycobacterium 
tuberculosis in infected macrophage cultures (12). See 
chapter 38 for a review of Mycobacterium phages. 

Another unique feature concerning the pharmacokinetics
of phage, unlike most other therapeutic
agents, is that phage contain a genome. There is evidence
that phage genomes can gain direct entrance to mammalian
cells. It has been reported that phage genomic fragments
have been found in mammalian cells following
oral exposure to phage DNA (21, 65). M13 and λ phage
DNA were found, using polymerase chain reaction, in
the cells of the gastrointestinal tract, peripheral white
blood cells, and the cells of the liver and spleen. Phage DNA
could be detected for up to 24 hours in the spleen and
liver following a single oral dose of phage DNA. However,
when phage M13 DNA was fed daily for 1 week, Doerfler
et al. (21) were able to isolate clones containing M13 DNA
from the mouse spleen. One of these clones contained a
1299 nucleotide fragment of M13 DNA covalently linked
to an 80 nucleotide DNA segment that had 70% homology
to the mouse IgE receptor gene (64). In addition,
when pregnant mice were regularly fed phage M13 DNA, 
evidence of M13 DNA could be detected in the fetuses 
with in situ hybridization methods. In some rare fetal 
cells this M13 DNA appeared to be associated with the 
chromosomes (65). There have also been reports of 
phage-induced enzyme activity in mammalian cells, 
albeit at low levels, following exposure to phage or 
phage DNA (34, 47). The integration of phage DNA into 
the genomes of mammals, as the result of phage therapy 
for an infection, might result in the loss of heterozygosity 
of tumor suppressor genes. However, the effects from 
such events are probably minimal given that phages are 
associated with bacteria in our colon, nose, throat, and 
skin throughout our normal life span. While phage gene 
delivery and expression in mammalian cells may 
normally be rare, phage are currently being genetically 
engineered to enhance these capacities. Such engineered 
phage may be able to serve as vectors for targeted gene 
delivery in mammalian cells (40). 


Phage in Mammalian Host Infected 
with Bacteria 

For many pharmacological agents, information on drug 
distribution and clearance would be sufficient. However, 
phages are not passive pharmacological agents. Like the
infectious agent, the bacterium, they are capable of
exponential growth. A full knowledge of the pharmacokinetics
of phage antibacterial therapy requires knowledge
of three dynamic components and their complex
interactions: the infected human, the infecting bacterium,
and the phage. Two of these three components,
the bacterium and the phage, are capable of exponential
growth during the course of the infection and
its treatment (see chapter 5 for an overview of the dynamic 
interactions of two of these components, the phage and 
bacterium). 

One of the first researchers to recognize the need for 
quantitative data to determine whether phage can sustain 
exponential growth in vivo, Rene Dubos, made use of an 
animal infectious diseases model employing intracerebral 
injections of Shigella dysenteriae (22). The data obtained 
from these experiments demonstrated the multiplication of 
phage, in infected animals treated with phage, at the site 
of infection, the brain (figure 48-3). Concurrently he and
his coworkers demonstrated that phage treatment was
capable of rescuing infected animals (survival of untreated
animals was 3.6% while survival of phage-treated animals
was 72%). These researchers also showed that heat-inactivated
phage provided no protective effects unless
the heat-inactivated phage preparations were given days 
before the bacterial infection. They suggested that this 
protective effect may have been due to the activation of 
antibacterial immunity by bacterial products present in 
the phage lysate. 

Thirty-nine years after the publication by Dubos et al. 
(22), Smith and Huggins reported results in similar 
experiments (67). In one of their experiments mice were 
infected intracerebrally with E. coli K1 and in another 
they were infected intramuscularly. In both experiments 
phage were administered intramuscularly. The results of
one of these studies, in which the mice were infected
intracerebrally, are illustrated graphically in figure 48-4.
As in the Dubos et al. (22) study, the phage levels were
highest in the bacteria-infected tissue, the brain. The
phage levels fell as the bacterial levels in the brain
decreased. Unfortunately, all the data in Smith and
Huggins’ studies were gathered from only 3 animals at
each time point, so no meaningful statistical analysis
could be performed. In addition, the graph does not reflect 
the fact that animals were dying during the course of 
this experiment. For example, it is not possible by looking 
at the graph or the data in the tables in the Smith and 
Huggins (67) paper to recognize that 50% of the untreated 
animals died by 72 hours and 75% died by 96 hours. 

Figure 48-3 Graphical representation of data from the 1943 infectious disease model in which mice inoculated by
intracerebral injection of the bacterium Shigella dysenteriae (at an LD50 level) were compared with uninfected control mice.
All the mice in this experiment were injected intraperitoneally with 10⁹ plaque forming units (PFU) of phage,
administered at the same time as the bacterial inoculation. The bacteriophage level in the blood of the uninfected
animals was compatible with the dilution of the phage concentration in the total fluid volume of the mouse and the lower
levels in the brain reflect the relatively smaller blood content in the brain. However, in the infected animals the phage
particles are observed to increase at the site of the infection, the brain, while the blood levels of phage appear to be
a “reflection of the events occurring in the brain.”


Figure 48-4 Graphical representation of data from a phage therapy trial. The data presented are those of Smith and Huggins
(67). In this set of experiments all the animals received an intracerebral inoculation of 5 × 10² colony forming units (CFU) of
E. coli K1. The animals treated with phage were injected with 3 × 10⁸ plaque forming units (PFU) of phage intramuscularly
(into the gastrocnemius muscle) at the same time as the bacterial inoculation. These graphs were derived from the data
published in tables 9 and 10 in Smith and Huggins’ paper.

Despite these problems, this work has stimulated considerable
interest and analysis.

Levin and Bull (43) developed a formal mathematical 
model based on data from the Smith and Huggins (67) 
study. Analysis based on this model resulted in their 
suggestion that the following four elements are critical for 
successful phage therapy: 

1. How “aggressively” phage can lyse bacterial cultures,
as determined by adsorption rate, burst size, and
latent period, is a critical factor. The particular
phage that Smith and Huggins found to be most
effective therapeutically was both specific against the
K1 capsule and known to lyse bacteria more
rapidly in culture than a non-K1-specific phage. (See
chapter 5 for a general discussion of these various 
phage growth parameters in terms of phage ecology 
and the impact of phage on bacteria.) 

2. The limited efficacy of antibiotics may be due to 
the fact that antibiotics decay in the animal whereas 
the phage (if a sufficient population of bacteria is 
present) will multiply and increase. 

3. Even though phage are capable of multiplication, the 
initial dose of phage must be sufficient to control 
the bacterial population before it reaches a lethal 
threshold. 

4. The virulence of the bacterium (how rapidly it can 
multiply in a host) is also relevant. A more virulent 
bacterial strain will require either a higher dose of 
phage or a more virulent phage if the infection is 
to be countered. The fact that the phage-resistant
E. coli that arose following phage therapy were E. coli
K1⁻ and therefore less virulent may or may not
have played a role in Smith and Huggins’ investigations,
but it clearly was not favorable to the bacteria.
In contrast, antibiotic-resistant bacterial mutants
are usually not less virulent.

The importance of modeling the therapeutic use of phage 
was also stressed by Payne and Jansen (58). Their model 
includes terms for the loss of phage due to interaction 
with mammalian systems, such as the RES of the innate 
immune system. Studies using their model suggest that 
the use of antibiotics may at times interfere with phage 
therapy. 
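
The qualitative behavior that such models describe can be illustrated with a minimal sketch. The Python code below is not the model of Levin and Bull (43) or that of Payne and Jansen (58). It is a simplified two-equation system, assumed here purely for illustration, in which bacteria grow exponentially, are removed by mass-action phage adsorption, and each infection immediately yields a burst of new phage (the latent period is ignored), while free phage are lost at a constant rate standing in for clearance by the host. The function name simulate and all parameter values are hypothetical and are not taken from the cited studies.

```python
def simulate(b0, p0, hours, growth=1.0, adsorption=1e-8, burst=100.0,
             phage_loss=0.5, dt=0.001):
    """Euler integration of a minimal, illustrative bacteria/phage model.

    dB/dt = growth * B - adsorption * B * P
    dP/dt = burst * adsorption * B * P - phage_loss * P

    b0, p0     : starting bacterial (CFU) and phage (PFU) counts
    growth     : bacterial growth rate per hour (hypothetical value)
    adsorption : adsorption rate constant (hypothetical value)
    burst      : phage released per infected cell (hypothetical value)
    phage_loss : host clearance rate of free phage per hour (hypothetical)
    """
    b, p = float(b0), float(p0)
    for _ in range(int(round(hours / dt))):
        infections = adsorption * b * p          # infections per hour
        db = growth * b - infections
        dp = burst * infections - phage_loss * p
        b = max(b + db * dt, 0.0)                # counts cannot go negative
        p = max(p + dp * dt, 0.0)
    return b, p


# Compare no treatment with two hypothetical phage doses after 10 hours.
for dose in (0.0, 1e8, 1e9):
    bacteria, phage = simulate(b0=1e3, p0=dose, hours=10)
    print(f"dose {dose:.0e} PFU -> bacteria {bacteria:.3g}, phage {phage:.3g}")
```

With these illustrative values the larger starting dose holds the bacterial count lower at 10 hours than the smaller dose does, which is the kind of dose dependence described in point 3 above. A model intended for planning would also need a latent period, host defenses against the bacteria, and the emergence of phage-resistant mutants.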

While these models and the data of Smith and Huggins 
provide some insights into the pharmacokinetics of phage 
therapy, it is imperative that statistically sound experiments
be performed so that more accurate modeling
and pharmacological planning can be developed for the
therapeutic use of phage. Recently developed methods of
visualizing bacterial infections in live animals using
bioluminescent strains of bacteria may help in this
endeavor (27). In this method bacteria carrying a lux
transposon cassette are bioluminescent and can be
followed in live mice by the use of a high-sensitivity
charge-coupled device camera. The technique has been
used to follow pneumococcal infections in the lungs of live
mice. By incorporating an appropriate cassette into a
phage it should also be possible to follow phage interactions
with bacterial infections in vivo (see chapter 46 for
a review of phage-based reporter systems).


Immunogenicity of Phage 

Phage and the Innate Immune System 

The immune response is dependent on two components:
the innate and the adaptive immune systems (46). The
adaptive immune system relies on somatic mutations
and clonal expansion of T and B cells in response to an
infection. Such clonal expansion requires at least 3-5 days to
generate a sufficient number of cells to provide an effective
level of antibodies. In contrast, the innate system is
dependent on evolution for the development of its functions
and it is inherited in a Mendelian fashion. It includes
antimicrobial peptides, the alternative complement pathway,
and phagocytes, including those of the organs of the
RES, primarily the liver and spleen. It is the innate immune
system that first interacts with a foreign body such as a
phage when the animal or person has had no prior exposure
to this agent. The same mechanism resulted in the
rapid loss of λ phage injected into the circulatory system of
germ-free mice in Geier et al.'s (30) experiments, as these
mice had no detectable antibody response to phage λ.
Likewise, in experiments using T4 phage as a probe of the
innate immune system, it was found that the liver
phagocytosed more than 99% of that phage within 30 minutes
after inoculation and that it removed 12 times as much
phage as the spleen, as measured by the uptake of
⁵¹Cr-labeled phage (35).

To study the role of blood components of the innate 
immune system, Sokoloff et al. (70) preinjected rats with
GdCl₃ to inhibit phagocytosis by macrophages prior to
the administration of T7 phage. In these experiments the
PFU of phage in the circulatory system decreased by
95-99% within 5 minutes. As only 10% of the PFU could
be detected in the liver, 1% in the spleen, and less than
1% in the kidneys, lungs, heart, and skeletal muscles, it
was concluded that most of the phage was inactivated in 
the blood. This finding was supported by the fact that 
the half-life of the phage incubated in rat serum at 37°C 
for 30 minutes was determined to be less than 3 minutes. 
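
The in vivo and in vitro figures above can be compared on a common scale if one assumes simple first-order (single-exponential) loss of infective phage. That assumption is made here only for illustration; the source does not state the kinetics. Under it, a loss of 95-99% within 5 minutes corresponds to a half-life of roughly 0.75 to 1.2 minutes, which is consistent with the serum half-life of less than 3 minutes. The function names below are hypothetical.

```python
import math


def half_life_from_loss(fraction_lost, minutes):
    """Half-life (minutes) implied by losing `fraction_lost` of the phage
    in `minutes`, assuming first-order decay."""
    remaining = 1.0 - fraction_lost
    return minutes * math.log(2) / -math.log(remaining)


def fraction_remaining(half_life, minutes):
    """Fraction of phage still infective after `minutes` of first-order decay."""
    return 0.5 ** (minutes / half_life)


# Loss of 95% and 99% of circulating PFU within 5 minutes (values from the text).
for lost in (0.95, 0.99):
    print(f"{lost:.0%} lost in 5 min -> half-life "
          f"{half_life_from_loss(lost, 5):.2f} min")

# Upper bound from the serum experiment: a 3-minute half-life over 30 minutes.
print(f"fraction left after 30 min: {fraction_remaining(3, 30):.4f}")
```

The last line shows that even the upper bound of a 3-minute half-life leaves only about one phage in a thousand after a 30-minute incubation.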

Complement was shown to play a major role in this 
phage inactivation by experiments in which complement 
activity was inhibited by cobra venom factor (CVF). When
CVF was injected intraperitoneally 20 hours before phage
injection the loss of phage from the rat circulatory system
was significantly reduced. In experiments in vitro the recovery
of phage after a 30 minute incubation in rat serum
containing CVF was 50%, compared with less than 1% 
when CVF was not present. 

As previously mentioned, by selecting T7 phage from a
T7 phage peptide library for its ability to remain for
longer periods of time in the rat circulatory system,
Sokoloff et al. found that the long-circulating trait was
peptide-specific (70). These long-circulating T7 phage
were found to display peptides with either a C-terminal lysine
or arginine residue. Such peptides appear to protect the
phage from complement-mediated inactivation in rat or
mouse serum by binding to a serum protein. In Sokoloff
et al.’s rats this “protection” protein was C-reactive
protein, which is normally elevated in rats and mice (70).
This study is consistent with the prior study by Merril et al.
(50) in which a λ phage mutant with a capacity for long
circulating times in the mouse was found to have a
substitution of a lysine for a glutamic acid in a major capsid
protein. It should be noted that Sokoloff et al. found
that the protective protein in human serum may be
α₂-macroglobulin, rather than C-reactive protein, which
has this function in rats and mice. This may explain why
phage with tyrosine residues in the displayed peptides were 
protected in human sera rather than those displaying 
C-terminal lysine or arginine residues as in the rat sera 
experiments (70). Experiments such as these suggest that it 
may be possible to select and/or engineer phage for phage 
therapy with resistance to inactivation by components 
in the innate immune system. 

Phage and the Adaptive Immune System 

Many phage are also potent activators (antigens) of the
adaptive immune system. For the past three decades
Ochs et al. (56, 57) have made use of this capability of
phage φX174 to probe the human immune system. In
normal individuals, injected phage φX174 is cleared within
3 days and a primary IgM response can be observed that
peaks 2 weeks after immunization. If a second injection
is made 6 weeks after the primary immunization, IgM
and IgG antibody titers increase and peak within 1 week,
and subsequent phage injections result in further increases
in the IgG titers (57). Patients with severe combined
immune deficiency, characterized by absence of both B
and T cell functions, display a prolonged clearance of phage,
with phage present up to 4-6 weeks after the initial
injection. In addition, these severe combined immune
deficiency patients do not develop a detectable antibody
response to phage. Ochs et al. also noted that while φX174
phage is a potent antigen it causes no recognized toxic
effects in man (16, 56).

Similar activation of the adaptive immune system was
observed in mice inoculated with ENB6, a phage active
against vancomycin-resistant enterococcus (10). After the third
in a series of five monthly injections of phage ENB6, titers
of IgG and IgM increased above background 3,800-fold and
5-fold, respectively, and IgG levels did not change substantially
after the third injection. No anaphylactic reactions,
changes in core body temperature, or other adverse events
were observed in the mice over the course of these multiple
injections of phage. It may be possible to develop phage
that are less antigenic by using phage peptide libraries
or affinity matrices made up of antibodies from human
serum. This type of approach has already been initiated to
attempt to modulate the immunogenicity of therapeutically
important enzymes (36).

Preparation of Phage for 
Therapeutic Usage 

Early phage therapy applications used phages that were 
purified by filter sterilization. This method has proven to be 
insufficient because bacterial debris, including bacterial 
exo- or endotoxins that might be present, can pass through 
such filters. These contaminants can result in increased 
morbidity or in some cases mortality. For example, in a
recently published study the intraperitoneal inoculation
of mice with 10⁹ PFU of filter-sterilized λ phage lysate,
grown on E. coli, produced a mild reaction (ruffled fur).
However, all the mice injected in a similar manner
with P22 phage lysate, grown on Salmonella typhimurium,
died within 12 hours after inoculation. The endotoxin levels
in these preparations were 5 × 10⁴ and 5 × 10⁵ endotoxin
units (EU)/ml, respectively, as determined by limulus
amebocyte lysate assay (50). Additional purification by
techniques such as equilibrium density centrifugation
with cesium chloride can separate phage particles from
debris, including toxins, that does not have the same specific
buoyant density as the phage particles. Such centrifugation
was able to reduce the endotoxin levels in the phage
preparations discussed above to 0.3 × 10¹ and 1 × 10³ EU/ml,
respectively. In contrast to the problems noted with the 
filter-purified phage preparations, no adverse effects were 
noted in mice inoculated intraperitoneally with phage 
preparations purified by cesium chloride equilibrium 
density centrifugation (50). Phage have also been purified 
by precipitation with ammonium sulfate followed by 
separation through anion exchange columns (78). Phage 
prepared in this manner were administered to animals 
without any noticeable ill effects. In addition, Ochs et al. 
(56) used this phage purification method in a number of 
their human protocols. 
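
The size of the endotoxin reduction achieved by cesium chloride centrifugation is easier to compare when expressed as a fold change or a log10 reduction. The short Python sketch below does this arithmetic for the two preparations, using only the endotoxin values quoted above from reference 50; the function names and dictionary labels are illustrative.

```python
import math


def fold_reduction(before_eu_per_ml, after_eu_per_ml):
    """How many times lower the endotoxin level is after purification."""
    return before_eu_per_ml / after_eu_per_ml


def log10_reduction(before_eu_per_ml, after_eu_per_ml):
    """The same change expressed in log10 units."""
    return math.log10(fold_reduction(before_eu_per_ml, after_eu_per_ml))


# Endotoxin (EU/ml) before and after CsCl centrifugation, as quoted in the text.
preparations = {
    "lambda lysate grown on E. coli": (5e4, 3.0),        # 0.3 x 10^1 EU/ml after
    "P22 lysate grown on S. typhimurium": (5e5, 1e3),
}

for name, (before, after) in preparations.items():
    print(f"{name}: {fold_reduction(before, after):,.0f}-fold "
          f"({log10_reduction(before, after):.1f} log10)")
```

On these figures the λ preparation was reduced about 17,000-fold, whereas the P22 preparation was reduced 500-fold and still retained 10³ EU/ml.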

Testing for adverse effects associated with phage 
preparations should not be limited to observations of healthy
animals. Animals that are stressed may have a lower
tolerance to endo- and exotoxins. In a recent study a lower 
survival rate was observed in bacteremic mice treated with 
a phage strain (known to have no in vitro activity against
the bacteria associated with the bacteremia) (10). In this
study, while the highest doses of the phage preparation 
produced no reported adverse effects in healthy animals,
an increased mortality was observed in bacteremic
stressed mice. This effect was shown to be dependent on the 
phage dose, suggesting that stressed animals may be more 
sensitive to the phage itself or to trace amounts of endo- 
and exotoxins present in the phage preparations than are 
normal animals. 

These experiments suggest that the presence of toxins 
in early phage preparations may explain some of the 
catastrophic results reported in early attempts to use 
phage to treat bacterial infections in humans. In one such 
example, reported in 1932, a phage strain was found that 
could lyse broth cultures of plague (Yersinia pestis) in less 
than 2 hours. However, when a filter-sterilized lysate 
containing this phage strain was injected into rabbits
that had been experimentally infected with Y. pestis the 
mortality increased to levels above that found in infected 
rabbits that were not treated with phage. Furthermore, 
when this phage preparation was then used to treat 33 
human patients they all died (mortality from plague is 
normally 60-90%) (52). 

While the omission of purification processes may result 
in increased levels of contamination, including toxins, the 
overzealous addition of agents to assure that there are no 
active bacteria present in phage preparations can also be 
detrimental. The association of a “weak” phage preparation
and the presence of organomercury compounds was
made in a 1932 study of commercial phage preparations 
from a major US pharmaceutical company (74). 

The problems associated with the production of phage 
for clinical use are not insurmountable, as evidenced by 
over three decades of phage use in humans (56, 57). In addition,
animal experiments have provided evidence that
relatively simple phage purification processes, such as
cesium chloride equilibrium density centrifugation, can
significantly reduce animal morbidity and mortality. 

Methods for the Rapid Determination of 
Phage Specific for Infecting Pathogen 

When a clinician is confronted with a patient with an 
infectious disease the prudent course of action requires 
the determination of the identity of the infectious agent. 
This task can often be time-consuming and laborious, involving
isolation and identification of the causative agent.
Often, given the time needed to make such a determination,
physicians use their best judgment to choose and
administer a relatively broad-range antibiotic that is effective
for the suspected bacterial strain while they wait for
the culture- and antibiotic-sensitivity results. In contrast, if
phages are to be used in place of antibiotics it is critical
to actually determine the strain of phage to be used,
given the generally narrow host range displayed by most
phage. Such determinations using current technology could
easily take days to perform. If phage are to be used as
therapeutic antibacterial agents then a rapid and inexpensive
method to determine the nature of the infecting
bacterium and its phage susceptibility is needed.

One approach that can be taken to this problem is 
based on the use of modified phage containing reporter 
genes (see chapter 46). In this method phage are first isolated
and identified as being potential therapeutic agents
for a particular species or strain of bacteria. These phage 
are genetically engineered to encode a reporter gene such 
that a characteristic color or marker will be produced 
when the phage infects the specific bacterial strain that 
is susceptible to that phage. For example, if different 
strains of phage carrying the luciferase reporter gene 
were placed in a multiwell plate, along with an aliquot 
of a clinical isolate, the emission of light from any of the 
wells would serve to identify the bacterial strain in the 
clinical isolate as well as the phage strain that may be 
used to treat the bacterial infection. Such testing could 
be performed in hours, instead of the days that traditional
culturing methods require. This approach has been
used with phage carrying the luciferase reporter gene to 
detect Listeria contamination in foods (45). A similar 
approach has also been developed for a rapid and relatively 
inexpensive diagnostic test for Mycobacteria infections in 
patients suspected to have tuberculosis (15). Alternatively, 
given that lysis of bacterial strains by phage will result 
in the discharge of adenylate kinase which can convert 
ADP in the reaction mix to ATP, and that luciferin/ 
luciferase can utilize the ATP for light emission, placing 
luciferin/luciferase in the incubation mixture will serve to 
identify the organism and the appropriate phage without 
the need to genetically endow the phage with a luciferase 
reporter gene (66). 
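
The multiwell readout described above amounts to a simple decision rule: a well is scored positive when its light emission clearly exceeds that of a control well. The Python sketch below illustrates one such rule. The phage names, the luminometer readings, and the threshold of three times background are all hypothetical choices made for illustration; they are not taken from the cited studies, and a real assay would need a validated cutoff.

```python
def susceptible_phages(readings, background, fold_over_background=3.0):
    """Return the phages whose wells emit light above the cutoff,
    strongest signal first.

    readings   : dict mapping phage name -> luminescence of that well
    background : luminescence of a control well with no reporter phage
    fold_over_background : hypothetical cutoff multiplier
    """
    cutoff = background * fold_over_background
    hits = [(name, signal) for name, signal in readings.items()
            if signal > cutoff]
    hits.sort(key=lambda item: item[1], reverse=True)
    return [name for name, _ in hits]


# Hypothetical plate: one clinical isolate tested against four reporter phages.
plate = {"phage_A": 120.0, "phage_B": 9800.0, "phage_C": 95.0, "phage_D": 450.0}
print(susceptible_phages(plate, background=100.0))
```

The wells that light up identify both the likely bacterial strain and the candidate therapeutic phage. The same rule could be applied to the adenylate kinase readout, where light depends on lysis and not on a reporter gene.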

Recent advances in mass spectrometry may also 
provide fast methods for the identification of bacterial 
strains. It is now possible to rapidly identify and type 
bacterial strains based on their lipid, protein, and nucleic 
acid mass fragment “fingerprints” (41, 79). While mass 
spectrometry is currently being developed to identify bacterial
strains it might also be possible to use this approach
to determine whether the bacteria are susceptible to a 
given phage. However, such information is not currently 
available and it may be impractical to gain sufficient 
knowledge of bacterial mass “fingerprints” to accurately 
determine when a particular phage could be useful. 
Alternatively, one could use phage gene-product expression
for the development of markers both for bacterial
identification and as an indicator of phage susceptibility. In this
approach one could use mass spectrometry. First a clinical
sample would be placed in growth medium to amplify the
bacteria, followed by infection with selected “therapeutic”
phage strains. If the infecting bacteria were susceptible
to the phage, mass spectrometry would then detect signature
fragments of proteins that are expressed only when
infection of the bacteria occurs by a specific phage. Signature
fragments that are not part of the phage virion would
be generated from phage-specific proteins such as RNA
polymerase, regulatory protein, or lytic enzyme. This approach
could be used so that the generation and detection of
phage-specific products serves the same role as “reporter”
gene products in a manner analogous to the detection
of marker gene products described above.
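
The matching step in this proposed scheme can be sketched as a comparison of observed peak masses against a table of signature masses for each candidate phage. The Python code below is a minimal illustration under stated assumptions: the signature masses, the observed peaks, the phage names, and the 20 ppm tolerance are all hypothetical, and no such signature table is described in the source.

```python
def matched_signatures(observed_masses, signatures, tolerance_ppm=20.0):
    """Return, for each phage, the signature masses found in the spectrum.

    observed_masses : iterable of peak masses from the infected culture
    signatures      : dict mapping phage name -> list of signature masses
    tolerance_ppm   : allowed mass error in parts per million (hypothetical)
    """
    found = {}
    for phage, masses in signatures.items():
        hits = [
            m for m in masses
            if any(abs(obs - m) / m * 1e6 <= tolerance_ppm
                   for obs in observed_masses)
        ]
        if hits:
            found[phage] = hits
    return found


# Hypothetical signature table and spectrum.
signature_table = {
    "phage_1": [1500.000, 2200.500],
    "phage_2": [1800.250],
}
spectrum = [950.300, 1500.010, 2200.480]
print(matched_signatures(spectrum, signature_table))
```

Whether one matching fragment is enough to call the isolate susceptible, or whether every signature for a phage must be present, is a design decision that would have to be settled experimentally.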

DNA microarrays, in conjunction with polymerase 
chain reaction amplification, are also being developed for 
the rapid diagnosis of bacterial strains. This technology 
can be used to determine the susceptibility of bacterial 
strains to certain antibiotics (31). In principle it may be possible
to also develop such a method for the determination
of bacterial strains and their phage susceptibility. 

Therapeutic Use of Phage Products 
and Components 

Phage gene products and components may also serve as
antibacterial therapeutic agents. While such applications
lose the exponential growth capacity of phage they may
still be highly effective. For example, it has been suggested
that phage-encoded polypeptides could be developed into
a new class of antibiotics (9). This suggestion is based on
the recognition that the small-genome phages φX174
and Qβ encode polypeptides that interfere with bacterial
wall biosynthesis and that such inhibition can result in
bacterial lysis (see chapters 11 and 15, respectively, for
reviews of φX174 and Qβ biology). In another use of
phage gene products it has been demonstrated that
phage-encoded endolysins, which disrupt the peptidoglycan
matrix of the bacterial cell wall, and phage holins, which
permeabilize bacterial membranes, can also serve as
effective antibacterial agents (see chapter 10 for a review of
phage lysis proteins). A single dose of a phage lysin specific
for streptococci groups A, C, and E was capable of
clearing these bacteria both in vitro and in vivo in mouse
upper respiratory infections (54). As this lysin has little if
any effect on other commensal organisms, it should be
less disruptive of the oral and upper respiratory ecology
than most antibiotic treatments. A similar result was
obtained when a phage-encoded enzyme, PlyG lysin
(encoded by the γ phage of Bacillus anthracis), was used to
rescue mice infected with Bacillus cereus, a bacterial
strain that is closely related to Bacillus anthracis (66).
No resistant B. cereus strains were detected following
such treatment. In addition, as ATP is released when PlyG
lysin destroys B. anthracis, this enzyme in conjunction
with luciferin/luciferase can also be used to rapidly detect
γ-sensitive bacilli and their germinating spores. In this
application, spores are detected by first immobilizing them
on filter membranes. They are then incubated with
germinant and treated with PlyG lysin and luciferin/
luciferase. Emitted light is detected using a hand-held
luminometer. This system was able to detect as few as 100
spores (66).

Phage-encoded lysin enzymes have also been used 
prophylactically. Gaeng et al. developed a bacterial strain 
that secretes the functional bacteriophage lysin enzymes
Ply511 and Ply118. They used this bacterial strain to eliminate
Listeria monocytogenes from dairy starter cultures
(used in the production of cheese) (29). 

Concluding Remarks 

While phage therapy has been employed continually
since the initial discoveries of these viruses at the beginning
of the twentieth century, these clinical applications
have never faced the scrutiny now required in countries
that require certification of pharmacological agents. Such
certification is based on the results of studies of efficacy
and pharmacokinetics in animals and humans. As discussed,
there are a number of historical reasons for this
deficiency including the overshadowing discovery of the
antibiotics with their broader antibacterial host range.
However, phage deserve careful review as they may provide
ideal therapeutic agents for the treatment of emerging
antibiotic-resistant bacterial strains and, as Lederberg (42)
suggested, for the treatment of epidemics such as cholera
in refugee camps. They may prove especially useful in
agricultural applications where their high specificity can be
used to treat a bacterial infection without disturbing the
larger ecological systems, as is often the problem with
broad bacterial host range antibiotics (42). These suggestions
are strengthened by the recent observations that
many antibiotic-resistant bacterial strains are arising
through clonal selection. In recognition of this growing
problem, the US Food and Drug Administration (FDA)
recently announced that it is re-evaluating livestock
antibiotics currently on the market. In this regard, the
FDA is now requiring manufacturers of proposed livestock
pharmaceuticals to determine whether newly proposed
antibiotics will be associated with the emergence of pathogenic
organisms with resistance to drugs currently in use
for the treatment of human diseases (37). In a comment in
Nature concerning presentations at the American Society
for Microbiology meeting in Salt Lake City in 2002,
Knight noted the lack of genetic variability in
antibiotic-resistant bacteria. He cited evidence presented by Klugman
that only 10 strains of pneumococcus are associated
with 75% of the cases of antibiotic-resistant childhood
pneumonia and one half of these cases are caused by a
single strain, “Spain 23-E.” Similar results were reported
at a meeting by Herminia de Lencastre for a study of
methicillin-resistant Staphylococcus aureus (MRSA). In this
study only five strains of MRSA accounted for 70% of
3000 clinical isolates from 14 countries (38). This lack of
genetic variability in antibiotic-resistant bacteria suggests
that such pathogenic bacteria may offer ideal targets for
phage therapy.

In addition to the potential of phage for the treatment 
of antibiotic-resistant bacterial infections, phage with 
their generally narrow host range may be better suited 
than the currently employed antibiotics for a number of 
clinical applications. For example, in the treatment of bacterial
infections, phage can be targeted toward the specific
pathogen without disturbing complex bacterial ecological
systems such as those associated with the human
gastrointestinal system. Applications of phage to treat infections
may eliminate the iatrogenic effects of antibiotics such as
the antibiotic-related diarrheal diseases that range from
“nuisance” diarrhea to colitis associated with Clostridium 
difficile infections (8). 

Despite the clear need for, and in some cases advantages
of, phage therapy there may be some concern over
the view of regulatory agencies concerning approval of
phage as an antibiotic therapy. However, these agencies are
acutely aware of the problems associated with
antibiotic-resistant organisms and the need for new approaches
to this problem. As for the safety of phage therapy, humans
are normally in contact with phage throughout their life,
given the complex interactions of bacteria and these
viruses in the colon and upper respiratory system, and on
the skin. In this regard, many of the phages in current
collections were isolated from human waste. While some
phage carry harmful genes, it should be possible to eliminate
these phage or those genes from our collections of therapeutic
phage. In addition, it should be noted that Hans Ochs
and his colleagues have been using phage as a means to
determine the extent of immune deficiencies and as a
probe of the immune system in human studies for the
past three decades (56, 57). One additional fact that may
be of interest is that many vaccines for human use
were found, in the 1970s, to be contaminated with
phage (from contaminated fetal calf serum used to produce
these vaccines). Despite this contamination President
Nixon issued an executive order to permit their continued
use (48, 49).

Development of therapeutic phage will require a
commitment to meet the scientific standards required
of current pharmaceutical agents. In this effort the years
of experience gained from the use of phage to discover
many of the basic tenets of molecular biology should
prove to be an asset. This information, in addition to the
encouraging results of recent controlled animal experiments
demonstrating the capacity of phage to rescue
animals with life-threatening infections, suggests that
such an effort may result in the development of needed
antibacterial therapeutic agents.